@c -*- mode: texinfo; coding: utf-8 -*-
@c %**start of header
@setfilename ra-ra.info
@documentencoding UTF-8
@settitle ra:: — An array library for C++20
@c %**end of header

@c Keep track of
@c http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0834r0.html
@c http://open-std.org/JTC1/SC22/WG21/docs/papers/2017/p0573r2.html
@c http://open-std.org/JTC1/SC22/WG21/docs/papers/2017/p0356r2.html
@c References to source [ma··] or [ma···] current last is 117.

@set VERSION 25
@set UPDATED 2023 November 1

@copying
@code{ra::} (version @value{VERSION}, updated @value{UPDATED})

(c) Daniel Llorens 2005--2023

@smalldisplay
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
@end smalldisplay
@end copying

@dircategory C++ libraries
@direntry
* ra-ra: (ra-ra.info).  Expression template and multidimensional array library for C++.
@end direntry

@include my-bib-macros.texi
@mybibuselist{Sources}

@titlepage
@title ra::
@subtitle version @value{VERSION}, updated @value{UPDATED}
@author Daniel Llorens
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage

@ifnottex
@node Top
@top @code{ra::}

@insertcopying

@code{ra::}@footnote{/ə'ɹ-eɪ/, I guess.} is a general purpose multidimensional array and expression template library for C++20. Please keep in mind that this manual is a work in progress. There are many errors and whole sections unwritten.

@menu
* Overview::          Array programming and C++.
* Usage::             Everything you can do with @code{ra::}.
* Extras::            Additional libraries provided with @code{ra::}.
* Hazards::           User beware.
* Internals::         For all the world to see.
* The future::        Could be even better.
* Reference::         Systematic list of types and functions.
* @mybibnode{}::      It's been done before.
* Indices::           Or try the search function.
* Notes::             Technically...
@end menu

@end ifnottex

@iftex
@shortcontents
@end iftex

@c ------------------------------------------------
@node Overview
@chapter Overview
@c ------------------------------------------------

@cindex length
@cindex rank
@cindex shape
A multidimensional array is a container whose elements can be looked up using a multi-index (i₀, i₁, ...). Each of the indices i₀, i₁, ... has a constant range [0, n₀), [0, n₁), ... independent of the values of the other indices, so the array is ‘rectangular’. The number of indices in the multi-index is the @dfn{rank} of the array, and the list of all the @dfn{lengths} (n₀, n₁, ... nᵣ₋₁) is the @dfn{shape} of the array. We speak of a rank-@math{r} array or of an @math{r}-array.

Often we deal with multidimensional @emph{expressions} where the elements aren't stored anywhere, but are computed on demand when the expression is looked up. In this general sense, an ‘array’ is just a function of integers with a rectangular domain.

Arrays (as a representation of @dfn{matrices}, @dfn{vectors}, or @dfn{tensors}) are common objects in math and programming, and it is very useful to be able to manipulate arrays as individual entities rather than as aggregates. Not only is

@verbatim
A = B+C;
@end verbatim

much more compact and easier to read than

@verbatim
for (int i=0; i!=m; ++i)
    for (int j=0; j!=n; ++j)
        for (int k=0; k!=p; ++k)
            A(i, j, k) = B(i, j, k)+C(i, j, k);
@end verbatim

but it's also safer and less redundant. For example, the order of the loops may be something you don't really care about.

However, if array operations are implemented naively, a piece of code such as @code{A=B+C} may result in the creation of a temporary to hold @code{B+C} which is then assigned to @code{A}. This is wasteful if the arrays involved are large.

@cindex Blitz++
Fortunately the problem is almost as old as aggregate data types, and other programming languages have addressed it with optimizations such as @url{https://en.wikipedia.org/wiki/Loop_fission_and_fusion, ‘loop fusion’}, ‘drag along’ @mybibcite{Abr70}, or ‘deforestation’ @mybibcite{Wad90}. In the C++ context the technique of ‘expression templates’ was pioneered in the late 90s by libraries such as Blitz++ @mybibcite{bli17}. It works by making @code{B+C} into an ‘expression object’ which holds references to its arguments and performs the sum only when its elements are looked up. The compiler removes the temporary expression objects during optimization, so that @code{A=B+C} results (in principle) in the same generated code as the complicated loop nest above.
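
As a rough illustration of the idea, here is a minimal standalone sketch of an expression template for rank-1 arrays. It does not use @code{ra::} and is not how @code{ra::} is implemented; the names @code{Vec} and @code{Sum} are hypothetical, and only sums are supported.

@example
@verbatim
```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; a sketch, not the ra:: implementation.

// Expression object for l+r: it keeps references to its operands and
// computes an element only when that element is looked up.
template <class L, class R>
struct Sum
{
    L const & l;
    R const & r;
    std::size_t size() const { return l.size(); }
    double operator[](std::size_t i) const { return l[i] + r[i]; }
};

// A rank-1 array that owns its storage.
struct Vec
{
    std::vector<double> data;
    explicit Vec(std::size_t n, double x = 0.) : data(n, x) {}
    std::size_t size() const { return data.size(); }
    double operator[](std::size_t i) const { return data[i]; }
    double & operator[](std::size_t i) { return data[i]; }

    // Assigning an expression runs the single loop that evaluates it.
    template <class E>
    Vec & operator=(E const & e)
    {
        assert(e.size() == size());
        for (std::size_t i = 0; i != size(); ++i) {
            data[i] = e[i];
        }
        return *this;
    }
};

// B+C builds a Sum object and does no arithmetic yet.
inline Sum<Vec, Vec> operator+(Vec const & l, Vec const & r)
{
    return Sum<Vec, Vec>{l, r};
}

// (B+C)+D nests the expression objects.
template <class L, class R>
Sum<Sum<L, R>, Vec> operator+(Sum<L, R> const & l, Vec const & r)
{
    return Sum<Sum<L, R>, Vec>{l, r};
}
```
@end verbatim
@end example

With these definitions, @code{A = B + C + D} builds a nested @code{Sum} object and evaluates it in one loop inside the assignment, without creating an intermediate @code{Vec}. The expression objects hold references, so they must not outlive their operands.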

@menu
* Rank polymorphism::         What makes arrays special.
* Drag along and beating::    The basic array optimizations.
* Why C++::                   High level, low level.
* Guidelines::                How @code{ra::} tries to do things.
* Other libraries::           Inspiration and desperation.
@end menu

@c ------------------------------------------------
@node Rank polymorphism
@section Rank polymorphism
@c ------------------------------------------------

@dfn{Rank polymorphism} is the ability to treat an array of rank @math{r} as an array of lower rank where the elements are themselves arrays.

@cindex cell
@cindex frame
For example, think of a matrix A, a 2-array with shape (n₀, n₁) where the elements A(i₀, i₁) are numbers. If we consider the subarrays A(0, ...), A(1, ...), ..., A(n₀-1, ...) as individual elements, then we have a new view of A as a 1-array of length n₀ with those rows as elements. We say that the rows A(i₀)≡A(i₀, ...) are the 1-@dfn{cells} of A, and the numbers A(i₀, i₁) are 0-cells of A. For an array of arbitrary rank @math{r} the (@math{r}-1)-cells of A are called its @dfn{items}. The prefix of the shape (n₀, n₁, ... nᵣ₋₁₋ₖ) that is not taken up by the k-cell is called the k-@dfn{frame}.

An obvious way to store an array in linearly addressed memory is to place its items one after another. So we would store a 3-array as

@quotation
A: [A(0), A(1), ...]
@end quotation

and the items of A(i₀), etc. are in turn stored in the same way, so

@quotation
A: [A(0): [A(0, 0), A(0, 1) ...], ...]
@end quotation

and the same for the items of A(i₀, i₁), etc.

@quotation
A: [[A(0, 0): [A(0, 0, 0), A(0, 0, 1) ...], A(0, 1): [A(0, 1, 0), A(0, 1, 1) ...]], ...]
@end quotation

@cindex order, row-major
This way to lay out an array in memory is called @dfn{row-major order} or @dfn{C-order}, since it's the default order for built-in arrays in C (@pxref{Other libraries}). A row-major array A with shape (n₀, n₁, ... nᵣ₋₁) can be looked up like this:

@anchor{x-steps}
@quotation
A(i₀, i₁, ...) = (storage-of-A) [(((i₀n₁ + i₁)n₂ + i₂)n₃ + ...)+iᵣ₋₁] = (storage-of-A) [o + s₀i₀ + s₁i₁ +  ...]
@end quotation

@cindex step
@cindex stride
where the numbers (s₀, s₁, ...) are called the @dfn{steps}@footnote{Sometimes `strides'. Cf. @url{https://en.wikipedia.org/wiki/Dope_vector, @dfn{dope vector}}}. Note that the ‘linear’ or ‘raveled’ address [o + s₀i₀ + s₁i₁ +  ...] is an affine function of (i₀, i₁, ...). If we represent an array as a tuple

@quotation
A ≡ ((storage-of-A), o, (s₀, s₁, ...))
@end quotation

then any affine transformation of the indices can be achieved simply by modifying the numbers (o, (s₀, s₁, ...)), with no need to touch the storage. This includes common operations such as: @ref{x-transpose,transposing} axes, @ref{x-reverse,reversing} the order along an axis, most cases of @ref{Slicing,slicing}, and sometimes even reshaping or tiling the array.

A basic example is obtaining the i₀-th item of A:

@quotation
A(i₀) ≡ ((storage-of-A), o+s₀i₀, (s₁, ...))
@end quotation

Note that we can iterate over these items by simply bumping the pointer o+s₀i₀ by s₀ each time. This means that iterating over (k>0)-cells doesn't cost any more than iterating over 0-cells (@pxref{Cell iteration}).
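
The address computation above can be sketched in plain C++ as follows. This is a standalone illustration that does not use @code{ra::}; the function names @code{row_major_steps} and @code{address} are hypothetical.

@example
@verbatim
```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major steps for a shape: the last axis has step 1 and every
// earlier step is the product of all the later lengths.
inline std::vector<std::ptrdiff_t>
row_major_steps(std::vector<std::ptrdiff_t> const & shape)
{
    std::vector<std::ptrdiff_t> steps(shape.size());
    std::ptrdiff_t s = 1;
    for (std::size_t k = shape.size(); k-- > 0;) {
        steps[k] = s;
        s *= shape[k];
    }
    return steps;
}

// Linear address o + s0*i0 + s1*i1 + ... of the multi-index i.
inline std::ptrdiff_t
address(std::ptrdiff_t o, std::vector<std::ptrdiff_t> const & steps,
        std::vector<std::ptrdiff_t> const & i)
{
    assert(steps.size() == i.size());
    std::ptrdiff_t a = o;
    for (std::size_t k = 0; k != i.size(); ++k) {
        a += steps[k] * i[k];
    }
    return a;
}
```
@end verbatim
@end example

For shape (2, 3, 4) the steps are (12, 4, 1). Swapping two steps (together with the matching lengths) transposes those axes, and the i₀-th item is addressed by the same call with offset o+s₀i₀ and the remaining steps (s₁, s₂). In neither case is the storage touched.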

@c ------------------------------------------------
@node Drag along and beating
@section Drag along and beating
@c ------------------------------------------------

These two fundamental array optimizations are described in @mybibcite{Abr70}.

@dfn{Drag-along} is the process that delays evaluation of array operations. Expression templates can be seen as an implementation of drag-along. Drag-along isn't an optimization in and of itself; it simply preserves the necessary information up to the point where the expression can be executed efficiently.

@dfn{Beating} is the implementation of certain array operations on the array @ref{Containers and views,view} descriptor instead of on the array contents. For example, if @code{A} is a 1-array, one can implement @ref{x-reverse,@code{reverse(A, 0)}} by negating the @ref{x-steps,step} and moving the offset to the other end of the array, without having to move any elements. More generally, beating applies to any function-of-indices (generator) that can take the place of an array in an array expression. For instance, an expression such as @ref{x-iota,@code{1+iota(3, 0)}} can be beaten into @code{iota(3, 1)}, and this can enable further optimizations.
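
Both kinds of beating can be sketched in a few lines of standalone C++. The types @code{View1} and @code{Iota} and the function @code{reversed} are hypothetical stand-ins for illustration, not the @code{ra::} types.

@example
@verbatim
```cpp
#include <cassert>
#include <cstddef>

// Hypothetical rank-1 view descriptor: pointer to element 0, length, step.
struct View1
{
    double * p;
    std::ptrdiff_t len;
    std::ptrdiff_t step;
    double & operator()(std::ptrdiff_t i) const { return p[i * step]; }
};

// Reverse by beating: point at the last element and negate the step.
// No element is moved or copied.
inline View1 reversed(View1 v)
{
    if (v.len == 0) {
        return v;
    }
    return View1{v.p + (v.len - 1) * v.step, v.len, -v.step};
}

// Hypothetical generator: element i is origin+i, for 0 <= i < len.
struct Iota
{
    std::ptrdiff_t len;
    std::ptrdiff_t origin;
    std::ptrdiff_t operator()(std::ptrdiff_t i) const { return origin + i; }
};

// k + iota(len, origin) is beaten into iota(len, origin+k):
// the sum is folded into the descriptor instead of being applied
// to each element.
inline Iota operator+(std::ptrdiff_t k, Iota g)
{
    return Iota{g.len, g.origin + k};
}
```
@end verbatim
@end example

Since @code{reversed} returns a view of the same storage, writing through the reversed view modifies the original data.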

@c ------------------------------------------------
@node Why C++
@section Why C++
@c ------------------------------------------------

Of course the main reason is that (this being a personal project) I'm more familiar with C++ than with other languages to which the following might apply.

C++ supports the low level control that is necessary for interoperation with external libraries and languages, but still has the abstraction power to create the features we want even though the language has no native support for most of them.

@cindex APL
@cindex J
The classic array languages, APL @mybibcite{FI73} and J @mybibcite{Ric08}, have array support baked in. The same is true for other languages with array facilities such as Fortran or Octave/Matlab. Array libraries for general purpose languages usually depend heavily on C extensions. In Numpy's case @mybibcite{num17} this is both for reasons of flexibility (e.g. to obtain predictable memory layout and machine types) and of performance.

On the other extreme, an array library for C would be hampered by the limited means of abstraction in the language (no polymorphism, no metaprogramming, etc.) so the natural choice of C programmers is to resort to code generators, which eventually turn into new languages.

In C++, a library is enough.

@c ------------------------------------------------
@node Guidelines
@section Guidelines
@c ------------------------------------------------

@code{ra::} attempts to be general, consistent, and transparent.

@c @cindex J # TODO makeinfo can't handle an entry appearing more than once (it creates multiple entries in the index).
Generality is achieved by removing arbitrary restrictions and by adopting the rank extension mechanism of J. @code{ra::} supports array operations with an arbitrary number of arguments. Any of the arguments in an array expression can be read from or written to. Arrays or array expressions can be of any rank. Slicing operations work for subscripts of any rank, as in APL. You can use your own types as array elements.

Consistency is achieved by having a clear set of concepts and having the realizations of those concepts adhere to the concept as closely as possible. @code{ra::} offers a few different types of views and containers, but it should be possible to use them interchangeably whenever it makes sense. For example, it used to be the case that you couldn't create a higher rank iterator on a @code{SmallView}, even though you could do it on a @code{View}; this was a bug.

Sometimes consistency requires a choice. For example, given array views A and B, @code{A=B} copies the contents of view @code{B} into view @code{A}. To change view @code{A} instead (to treat @code{A} as a pointer) would be the default meaning of @code{A=B} for C++ types, and result in better consistency with the rest of the language, but I have decided that having consistency between views and containers (which ‘are’ their contents in a sense that views aren't) is more important.

Transparency is achieved by avoiding unnecessary abstraction. An array view consists of a pointer and a list of steps and I see no point in hiding it. Manipulating the steps directly is often useful. A container consists of storage and a view and that isn't hidden either. Some of the types have an obscure implementation but I consider that a defect. Ideally you should be able to rewrite expressions on the fly, or plug in your own traversal methods or storage handling.

This doesn't mean that you need to be in command of a lot of internal detail to be able to use the library. I hope to have provided a high level interface to most operations and a reasonably usable syntax. However, transparency is critical to achieve interoperation with external libraries and languages. When you need to, you'll be able to guarantee that an array is stored in compact columns, or that the real parts are interleaved with the imaginary parts.

@c ------------------------------------------------
@node Other libraries
@section Other array libraries
@c ------------------------------------------------

Here I try to list the C++ array libraries that I know of, or libraries that I think deserve a mention for the way they deal with arrays. It is not an extensive review, since I have only used a few of these libraries myself. Please follow the links if you want to be properly informed.

Since the C++ standard library doesn't offer a standard multidimensional array type, some libraries for specific tasks (linear algebra operations, finite elements, optimization) offer an accessory array library, which may be more or less general. Other libraries have generic array interfaces without needing to provide an array type. FFTW is a good example, maybe because it isn't C++!

@subsection Standard C++

The C++ language offers multidimensional arrays as a legacy feature from C, e.g. @code{int a[3][4]}. These decay to pointers when you do nearly anything with them, don't know their own shape or rank at runtime, and are generally too limited.

The C++ standard library also offers a number of contiguous storage containers that can be used as 1-arrays: @code{<array>}, @code{<vector>} and @code{<valarray>}. None of them supports higher ranks out of the box, but @code{<valarray>} offers array operations for 1-arrays. @code{ra::} makes use of @code{<array>} and @code{<vector>} for storage and bootstrapping.

@code{ra::} accepts built-in arrays and standard library types as array objects (@pxref{Compatibility}).

@subsection Blitz++
@cindex Blitz++
Blitz++ @mybibcite{bli17} pioneered the use of expression templates in C++. It supported higher rank arrays, as high as was practical in C++98, but not runtime rank. It also supported small arrays with compile time shape (@code{Tiny}), and convenience features such as Fortran-order constructors and arbitrary lower bounds for the array indices (both of which @code{ra::} chooses not to support). It placed a strong emphasis on performance, with array traversal methods such as blocking, space filling curves, etc.

However, the implementation had to fight the limitations of C++98, and it offered no general rank extension mechanism.

One important difference between Blitz++ and @code{ra::} is that Blitz++'s arrays were reference counted. @code{ra::} doesn't do any memory management on its own: the default container (data-owning) types are values, and views are distinct types. You can select your own storage for the data-owning objects, including reference-counted storage (@code{ra::} declares a type using @code{std::shared_ptr}), but this is not the default.

@subsection Other C++ libraries

TODO

@subsection Other languages

TODO Maybe review other languages, at least the big ones (Fortran / APL / J / Matlab / Python-Numpy).

@c ------------------------------------------------
@node Usage
@chapter Usage
@c ------------------------------------------------

This is an extended exposition of the features of @code{ra::} and is probably best read in order. For details on specific functions or types, please @pxref{Reference}.

@menu
* Using the library::          @code{ra::} is a header-only library.
* Containers and views::       Data objects.
* Array operations::           Building and traversing expressions.
* Rank extension::             How array operands are matched.
* Cell iteration::             At any rank.
* Slicing::                    Subscripting is a special operation.
* Special objects::            Not arrays, yet arrays.
* Functions::                  Ready to go.
* The rank conjunction::       J comes to C++.
* Compatibility::              With the STL and other libraries.
* Extension::                  Using your own types and more.
* Error handling::             What to check and what to do.
@end menu

@c ------------------------------------------------
@node Using the library
@section Using @code{ra::}
@c ------------------------------------------------

@code{ra::} is a header-only library with no dependencies other than the standard library, so you just need to place the @samp{ra/} folder somewhere in your include path and add @code{#include "ra/ra.hh"} at the top of your sources.

A compiler with C++20 support is required. For the current version this means at least @b{gcc 11} with @option{-std=c++20}. Some C++23 features are available with @option{-std=c++2b}. Check the top @code{README.md} for more up-to-date information.

Here is a minimal program@footnote{Examples given without context assume that one has declared @code{using std::cout;}, etc.}:

@example @c readme.cc [ma101]
@verbatim
#include "ra/ra.hh"
#include <iostream>

int main()
{
    ra::Big<char, 2> A({2, 5}, "helloworld");
    std::cout << ra::noshape << format_array(transpose<1, 0>(A), "|") << std::endl;
}
@end verbatim
@print{} h|w
   e|o
   l|r
   l|l
   o|d
@end example

The following headers are @emph{not} included by default:
@itemize
@item @code{"ra/dual.hh"}: A dual number type for simple uses of automatic differentiation.

@item @code{"ra/test.hh"}, @code{"ra/bench.hh"}: Used by the test and benchmark suites.
@end itemize

The header @code{"ra/bootstrap.hh"} can be used to configure @ref{Error handling}. You don't need to modify the header, but the configuration depends on including @code{"ra/bootstrap.hh"} before the rest of @code{ra::}. All other headers are for internal use by @code{ra::}.

@cindex container
@c ------------------------------------------------
@node Containers and views
@section Containers and views
@c ------------------------------------------------

@code{ra::} offers two kinds of data objects. The first kind, the @dfn{container}, owns its data. Creating a container uses up memory and destroying it causes that memory to be freed.

@cindex compile-time
@cindex ct
@cindex runtime
@cindex rt
There are three kinds of containers (ct: compile-time, rt: runtime): 1) ct size, 2) ct rank/rt shape, and 3) rt rank; rt rank implies rt shape. Some rt size arrays can be resized but rt rank arrays cannot normally have their rank changed. Instead, you create a new container or view with the rank you want.

For example:

@example
@verbatim
{
    ra::Small<double, 2, 3> a(0.);     // a ct size 2x3 array
    ra::Big<double, 2> b({2, 3}, 0.);  // a rt size 2x3 array
    ra::Big<double> c({2, 3}, 0.);     // a rt rank 2x3 array
    // a, b, c destroyed at end of scope
}
@end verbatim
@end example

Using the right kind of container can result in better performance. Ct shapes do not need to be stored in memory, which matters when you have many small arrays. Ct shape or ct rank arrays are also safer to use; sometimes @code{ra::} will be able to detect errors in the shapes or ranks of array operands at compile time, if the appropriate types are used.

Container constructors come in two forms. The first form takes a single argument which is copied into the new container. This argument provides shape information if the container type requires it.@footnote{The brace-list constructors of rank 2 and higher aren't supported on types of rt rank, because in the C++ grammar, a nested initializer list doesn't always define a rank unambiguously.}

@c [ma111]
@example
@verbatim
using ra::Small, ra::Big;
Small<int, 2, 2> a = {{1, 2}, {3, 4}};  // explicit contents
Big<int, 2> a1 = {{1, 2}, {3, 4}};      // explicit contents
Small<int, 2, 2> a2 = {{1, 2}};         // error: bad shape
Small<int, 2, 2> b = 7;                 // 7 is copied into b
Small<int, 2, 2> c = a;                 // the contents of a are copied into c
Big<int> d = a;                         // d takes the shape of a and a is copied into d
Big<int> e = 0;                         // e is a 0-array with one element e()==0.
@end verbatim
@end example

The second form takes two arguments, one giving the shape, the second the contents.

@cindex @code{none}
@cindex uninitialized container
@example
@verbatim
ra::Big<double, 2> a({2, 3}, 1.);     // a has shape [2 3], filled with 1.
ra::Big<double> b({2, 3}, ra::none);  // b has shape [2 3], default initialized
ra::Big<double> c({2, 3}, a);         // c has shape [2 3], a is copied into c
@end verbatim
@end example

The last example may result in an error if the shape of @code{a} and (2,@w{ }3) don't match. Here the shape of @code{1.} [which is ()] matches (2,@w{ }3) by a mechanism of rank extension (@pxref{Rank extension}). The special value @code{ra::none} can be used to request @url{https://en.cppreference.com/w/cpp/language/default_initialization, default initialization} of the container's elements.

The shape argument can have rank 0 only for rank 1 arrays.

@cindex @code{none}
@cindex uninitialized container
@example
@verbatim
ra::Big<int> c(3, 0);      // ok {0, 0, 0}, same as ra::Big<int> c({3}, 0)
ra::Big<int, 1> c(3, 0);   // ok {0, 0, 0}, same as ra::Big<int, 1> c({3}, 0)
ra::Big<int, 2> c({3}, 0); // error: bad length for shape
ra::Big<int, 2> c(3, 0);   // error: bad length for shape
@end verbatim
@end example

When the content argument is a pointer or a 1D brace list, it's handled specially: it is never used for shape@footnote{You can still use pointers or @code{std::initializer_list}s for shape by wrapping them in the functions @code{ptr} or @code{vector}, respectively.}, but only as the (row-major) ravel of the content. The pointer constructor is unsafe; use it at your own risk!@footnote{The brace-list constructors aren't rank extending, because giving the ravel is incompatible with rank extension. They are shape-strict: you must give every element.}

@cindex order, column-major
@example
@verbatim
Small<int, 2, 2> aa = {1, 2, 3, 4}; // ravel of the content

ra::Big<double, 2> a({2, 3}, {1, 2, 3, 4, 5, 6}); // same as a = {{1, 2, 3}, {4, 5, 6}}
@end verbatim
@end example

@c [ma112]
@example
@verbatim
double bx[6] = {1, 2, 3, 4, 5, 6};
ra::Big<double, 2> b({3, 2}, bx); // {{1, 2}, {3, 4}, {5, 6}}

double cx[4] = {1, 2, 3, 4};
ra::Big<double, 2> c({3, 2}, cx); // undefined behavior: reads past the end of cx
@end verbatim
@end example

@c [ma114]
@example
@verbatim
using lens = mp::int_list<2, 3>;
using steps = mp::int_list<1, 2>;
ra::SmallArray<double, lens, steps> a {{1, 2, 3}, {4, 5, 6}}; // stored column-major: 1 4 2 5 3 6
@end verbatim
@end example

These produce compile time errors:

@example
@verbatim
Big<int, 2> b = {1, 2, 3, 4};           // error: shape cannot be deduced from ravel
Small<int, 2, 2> b = {1, 2, 3, 4, 5};   // error: bad size
Small<int, 2, 2> b = {1, 2, 3};         // error: bad size
@end verbatim
@end example

@anchor{x-scalar-char-star}
Sometimes the pointer constructor gets in the way (see @ref{x-scalar,@code{scalar}}): @c [ma102]

@example
@verbatim
ra::Big<char const *, 1> A({3}, "hello"); // error: try to convert char to char const *
ra::Big<char const *, 1> A({3}, ra::scalar("hello")); // ok, "hello" is a single item
cout << ra::noshape << format_array(A, "|") << endl;
@end verbatim
@print{} hello|hello|hello
@end example

@cindex view
A @dfn{view} is similar to a container in that it points to actual data in memory. However, the view doesn't own that data and destroying the view won't affect it. For example:

@example
@verbatim
ra::Big<double> c({2, 3}, 0.);     // a rt rank 2x3 array
{
    auto c1 = c(1);                // the second row of array c
    // c1 is destroyed here
}
cout << c(1, 1) << endl;           // ok
@end verbatim
@end example

The data accessed through a view is the data of the ‘root’ container, so modifying the former will be reflected in the latter.

@example
@verbatim
ra::Big<double> c({2, 3}, 0.);
auto c1 = c(1);
c1(2) = 9.;                        // c(1, 2) = 9.
@end verbatim
@end example

Just as for containers, there are separate types of views depending on whether the shape is known at compile time, the rank is known at compile time but the shape is not, or neither the shape nor the rank is known at compile time. @code{ra::} has functions to create the most common kinds of views:

@example
@verbatim
ra::Big<double, 2> c {{1, 2, 3}, {4, 5, 6}};
auto ct = transpose<1, 0>(c); // {{1, 4}, {2, 5}, {3, 6}}
auto cr = reverse(c, 0); // {{4, 5, 6}, {1, 2, 3}}
@end verbatim
@end example

However, views can point to anywhere in memory and that memory doesn't have to belong to an @code{ra::} container. For example:

@example
@verbatim
int raw[6] = {1, 2, 3, 4, 5, 6};
ra::View<int> v1({{2, 3}, {3, 1}}, raw); // view with shape [2, 3] steps [3, 1]
ra::View<int> v2({2, 3}, raw);           // same, default C (row-major) steps
@end verbatim
@end example

Containers can be treated as views of the same kind (rt or ct). If you declare a function

@example
@verbatim
void f(ra::View<int, 3> & v);
@end verbatim
@end example

you may pass it an object of type @code{ra::Big<int, 3>}.


@c ------------------------------------------------
@node Array operations
@section Array operations
@c ------------------------------------------------

To apply an operation to each element of an array, use the function @code{for_each}. The array is traversed in an order that is decided by the library.

@example
@verbatim
ra::Small<double, 2, 3> a = {{1, 2, 3}, {4, 5, 6}};
double s = 0.;
for_each([&s](auto && a) { s+=a; }, a);
@end verbatim
@result{} s = 21.
@end example

To construct an array expression but stop short of traversing it, use the function @code{map}. The expression will be traversed when it is assigned to a view, printed out, etc.

@example
@verbatim
using T = ra::Small<double, 2, 2>;
T a = {{1, 2}, {3, 4}};
T b = {{10, 20}, {30, 40}};
T c = map([](auto && a, auto && b) { return a+b; }, a, b); // (1)
@end verbatim
@result{} c = @{@{11, 22@}, @{33, 44@}@}
@end example

Expressions may take any number of arguments and be nested arbitrarily.

@example
@verbatim
T d = 0;
for_each([](auto && a, auto && b, auto && d) { d = a+b; },
         a, b, d); // same as (1)
for_each([](auto && ab, auto && d) { d = ab; },
         map([](auto && a, auto && b) { return a+b; },
             a, b),
         d); // same as (1)
@end verbatim
@end example

The operator of an expression may return a reference and you may assign to an expression in that case. @code{ra::} will complain if the expression is somehow not assignable.

@example
@verbatim
T d = 0;
map([](auto & d) -> decltype(auto) { return d; }, d) // just pass d along
  = map([](auto && a, auto && b) { return a+b; }, a, b); // same as (1)
@end verbatim
@end example

@code{ra::} defines many shortcuts for common array operations. You can of course just do:

@example
@verbatim
T c = a+b; // same as (1)
@end verbatim
@end example

@c ------------------------------------------------
@node Rank extension
@section Rank extension
@c ------------------------------------------------

Rank extension is the mechanism that allows @code{R+S} to be defined even when @code{R}, @code{S} may have different ranks. The idea is an interpolation of the following basic cases.

Suppose first that @code{R} and @code{S} have the same rank. We require that the shapes be the same. Then the shape of @code{R+S} will be the same as the shape of either @code{R} or @code{S} and the elements of @code{R+S} will be

@quotation
@code{(R+S)(i₀ i₁ ... i₍ᵣ₋₁₎) = R(i₀ i₁ ... i₍ᵣ₋₁₎) + S(i₀ i₁ ... i₍ᵣ₋₁₎)}
@end quotation

where @code{r} is the rank of @code{R}.

Now suppose that @code{S} has rank 0. The shape of @code{R+S} is the same as the shape of @code{R} and the elements of @code{R+S} will be

@quotation
@code{(R+S)(i₀ i₁ ... i₍ᵣ₋₁₎) = R(i₀ i₁ ... i₍ᵣ₋₁₎) + S()}.
@end quotation

The two rules above are supported by all primitive array languages, e.g. Matlab @mybibcite{Mat}. But suppose that @code{S} has rank @code{s}, where @code{0<s<r}. Looking at the expressions above, it seems natural to define @code{R+S} by

@quotation
@code{(R+S)(i₀ i₁ ... i₍ₛ₋₁₎ ... i₍ᵣ₋₁₎) = R(i₀ i₁ ... i₍ₛ₋₁₎ ... i₍ᵣ₋₁₎) + S(i₀ i₁ ... i₍ₛ₋₁₎)}.
@end quotation

That is, after we run out of indices in @code{S}, we simply repeat the elements. We have aligned the shapes so:

@quotation
@verbatim
[n₀ n₁ ... n₍ₛ₋₁₎ ... n₍ᵣ₋₁₎]
[n₀ n₁ ... n₍ₛ₋₁₎]
@end verbatim
@end quotation

@cindex shape agreement, prefix
@cindex shape agreement, suffix
@c @cindex J
@cindex Numpy
This rank extension rule is used by the J language @mybibcite{J S} and is known as @dfn{prefix agreement}. The opposite rule of @dfn{suffix agreement} is used, for example, in Numpy @mybibcite{num17}@footnote{Prefix agreement is chosen for @code{ra::} because of the availability of a @ref{The rank conjunction,rank conjunction} @mybibcite{Ber87} and @ref{Cell iteration, cell iterators of arbitrary rank}. This allows rank extension to be performed at multiple axes of an array expression.}.

As you can verify, the prefix agreement rule is distributive. Therefore it can be applied to nested expressions or to expressions with any number of arguments. It is applied systematically throughout @code{ra::}, even in assignments. For example,

@example
@verbatim
ra::Small<int, 3> x {3, 5, 9};
ra::Small<int, 3, 2> a = x; // assign x(i) to each a(i, j)
@end verbatim
@result{} a = @{@{3, 3@}, @{5, 5@}, @{9, 9@}@}
@end example

@example
@verbatim
ra::Small<int, 3> x(0.);
ra::Small<int, 3, 2> a = {{1, 2}, {3, 4}, {5, 6}};
x += a; // sum the rows of a
@end verbatim
@result{} x = @{3, 7, 11@}
@end example

@example
@verbatim
ra::Big<double, 3> a({5, 3, 3}, ra::_0);
ra::Big<double, 1> b({5}, 0.);
b += transpose<0, 1, 1>(a); // b(i) = ∑ⱼ a(i, j, j)
@end verbatim
@result{} b = @{0, 3, 6, 9, 12@}
@end example
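
For readers who prefer explicit loops, the following standalone sketch spells out what prefix agreement means for a rank-2 and a rank-1 operand stored in row-major order. It does not use @code{ra::}, and the function names @code{add_prefix} and @code{accumulate_rows} are hypothetical.

@example
@verbatim
```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// R has shape (n0, n1) in row-major order and S has shape (n0).
// Prefix agreement: (R+S)(i0, i1) = R(i0, i1) + S(i0).
inline std::vector<double>
add_prefix(std::vector<double> const & R, std::size_t n0, std::size_t n1,
           std::vector<double> const & S)
{
    assert(R.size() == n0 * n1);
    assert(S.size() == n0); // the shapes must agree on the common prefix
    std::vector<double> out(R.size());
    for (std::size_t i0 = 0; i0 != n0; ++i0) {
        for (std::size_t i1 = 0; i1 != n1; ++i1) {
            out[i0 * n1 + i1] = R[i0 * n1 + i1] + S[i0];
        }
    }
    return out;
}

// x += a with x of shape (n0) and a of shape (n0, n1):
// x(i0) += a(i0, i1) for every i1, so x(i0) accumulates row i0 of a.
inline void
accumulate_rows(std::vector<double> & x, std::vector<double> const & a,
                std::size_t n1)
{
    assert(a.size() == x.size() * n1);
    for (std::size_t i0 = 0; i0 != x.size(); ++i0) {
        for (std::size_t i1 = 0; i1 != n1; ++i1) {
            x[i0] += a[i0 * n1 + i1];
        }
    }
}
```
@end verbatim
@end example

Calling @code{accumulate_rows} on a zero vector of length 3 and the 3x2 array with ravel 1, 2, 3, 4, 5, 6 gives 3, 7, 11, the same row sums as the @code{x += a} example above.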

@cindex Numpy
@cindex broadcasting, singleton, newaxis
A weakness of prefix agreement is that the axes you want to match aren't always the prefix axes. Other array systems offer a feature similar to rank extension called ‘broadcasting’ that is a bit more flexible. For example, in the way it's implemented in Numpy @mybibcite{num17}, an array of shape [A B 1 D] will match an array of shape [A B C D]. The process of broadcasting consists in inserting so-called ‘singleton dimensions’ (axes with length one) to align the axes that one wishes to match. You can think of prefix agreement as a particular case of broadcasting where the singleton dimensions are added to the end of the shorter shapes automatically.

A drawback of singleton broadcasting is that it muddles the distinction between a scalar and a vector of length 1. Sometimes, an axis of length 1 is no more than that, and if 2≠3 is a size mismatch, it isn't obvious why 1≠2 shouldn't be. To avoid this problem, @code{ra::} supports broadcasting with undefined length axes (see @ref{x-insert,@code{insert}}).

@example
@verbatim
ra::Big<double, 2> a({5, 3}, ra::_0);
ra::Big<double, 1> b({3}, 0.);
ra::Big<double, 2> c({1, 3}, ra::_0);

// b(?, i) += a(j, i) → b(i) = ∑ⱼ a(j, i) (sum columns)
b(ra::insert<1>) += a;

c = a; // 1 ≠ 5, still an agreement error
@end verbatim
@end example

Still another way to align array axes is provided by the @ref{The rank conjunction,rank conjunction}.

Even with axis insertion, it is still necessary that the axes one wishes to match are in the same order in all the arguments.
@ref{x-transpose,Transposing} the axes before extension is a possible workaround.

@c ------------------------------------------------
@node Cell iteration
@section Cell iteration
@c ------------------------------------------------

@code{map} and @code{for_each} apply their operators to each element of their arguments; in other words, to the 0-cells of the arguments. But it is possible to specify directly the rank of the cells that one iterates over:

@example
@verbatim
ra::Big<double, 3> a({5, 4, 3}, ra::_0);
for_each([](auto && b) { /* b has shape (5 4 3) */ }, iter<3>(a));
for_each([](auto && b) { /* b has shape (4 3) */ }, iter<2>(a));
for_each([](auto && b) { /* b has shape (3) */ }, iter<1>(a));
for_each([](auto && b) { /* b has shape () */ }, iter<0>(a)); // elements
for_each([](auto && b) { /* b has shape () */ }, a); // same as iter<0>(a); default
@end verbatim
@end example

One may specify the @emph{frame} rank instead:

@example
@verbatim
for_each([](auto && b) { /* b has shape () */ }, iter<-3>(a)); // same as iter<0>(a)
for_each([](auto && b) { /* b has shape (3) */ }, iter<-2>(a)); // same as iter<1>(a)
for_each([](auto && b) { /* b has shape (4 3) */ }, iter<-1>(a)); // same as iter<2>(a)
@end verbatim
@end example

In this way it is possible to match shapes in various ways. Compare

@example
@verbatim
ra::Big<double, 2> a = {{1, 2, 3}, {4, 5, 6}};
ra::Big<double, 1> b = {10, 20};
ra::Big<double, 2> c = a * b; // multiply (each item of a) by (each item of b)
@end verbatim
@result{} c = @{@{10, 20, 30@}, @{80, 100, 120@}@}
@end example

with

@example @c [ma105]
@verbatim
ra::Big<double, 2> a = {{1, 2, 3}, {4, 5, 6}};
ra::Big<double, 1> b = {10, 20, 30};
ra::Big<double, 2> c({2, 3}, 0.);
iter<1>(c) = iter<1>(a) * iter<1>(b); // multiply (each item of a) by (b)
@end verbatim
@result{} c = @{@{10, 40, 90@}, @{40, 100, 180@}@}
@end example

Note that in this case we cannot construct @code{c} directly from @code{iter<1>(a) * iter<1>(b)}, since the constructor for @code{ra::Big} matches its argument using (the equivalent of) @code{iter<0>(*this)}. See @ref{x-iter,@code{iter}} for more examples.

Cell iteration is appropriate when the operations naturally take operands of rank > 0; for instance, the operation ‘determinant of a matrix’ is naturally of rank 2. When the operation is of rank 0, such as @code{*} above, there may be faster ways to rearrange shapes for matching (@pxref{The rank conjunction}).
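
As a rough sketch of what the @code{iter<1>} example above computes, here is the same operation written with explicit loops in plain C++. No @code{ra::} types are used, @code{multiply_rows} is a hypothetical name, and the sketch assumes that every row of @code{a} has as many elements as @code{b}.

@example
@verbatim
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;

// Pair each 1-cell (row) of a with the single 1-cell of b, then multiply
// elementwise inside the cells.
Matrix multiply_rows(Matrix const & a, Vector const & b)
{
    Matrix c(a.size(), Vector(b.size(), 0.));
    for (std::size_t i=0; i!=a.size(); ++i) {        // frame: one step per row of a
        for (std::size_t j=0; j!=b.size(); ++j) {    // inside the 1-cells
            c[i][j] = a[i][j] * b[j];
        }
    }
    return c;
}
@end verbatim
@end example

The outer loop runs over the frame left after the 1-cells are taken out, and the inner loop is the rank 0 operation applied within each pair of cells.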

FIXME More examples.

@c ------------------------------------------------
@node Slicing
@section Slicing
@c ------------------------------------------------

Slicing is an array extension of the subscripting operation. However, tradition and convenience have given it a special status in most array languages, together with some peculiar semantics that @code{ra::} supports.

The form of the subscripting operator @code{A(i₀, i₁, ...)} makes it plain that @code{A} is a function of @code{rank(A)} integer arguments@footnote{The multi-argument square bracket form @code{A[i₀, i₁, ...]} is supported under C++23 compilers (e.g. gcc ≥ 12 with @code{-std=c++2b}), with the same meaning as @code{A(i₀, i₁, ...)}. Under C++20 only a single-argument square bracket form @code{A[i₀]} is available.}. An array extension is immediately available through @code{map}. For example:

@example
@verbatim
ra::Big<double, 1> a = {1., 2., 3., 4.};
ra::Big<int, 1> i = {1, 3};
map(a, i) = 77.;
@end verbatim
@result{} a = @{1., 77., 3., 77.@}
@end example

Just as with any use of @code{map}, array arguments are subject to the prefix agreement rule.

@example
@verbatim
ra::Big<double, 2> a({2, 2}, {1., 2., 3., 4.});
ra::Big<int, 1> i = {1, 0};
ra::Big<double, 1> b = map(a, i, 0);
@end verbatim
@result{} b = @{3., 1.@} // @{a(1, 0), a(0, 0)@}
@end example

@example
@verbatim
ra::Big<int, 1> j = {0, 1};
b = map(a, i, j);
@end verbatim
@result{} b = @{3., 2.@} // @{a(1, 0), a(0, 1)@}
@end example

The latter is a form of sparse subscripting.

Most array operations (e.g. @code{+}) are defined through @code{map} in this way. For example, @code{A+B+C} is defined as @code{map(+, A, B, C)} (or the equivalent @code{map(+, map(+, A, B), C)}). Not so for the subscripting operation:

@example
@verbatim
ra::Big<double, 2> A {{1., 2.}, {3., 4.}};
ra::Big<int, 1> i = {1, 0};
ra::Big<int, 1> j = {0, 1};
// {{A(i₀, j₀), A(i₀, j₁)}, {A(i₁, j₀), A(i₁, j₁)}}
ra::Big<double, 2> b = A(i, j);
@end verbatim
@result{} b = @{@{3., 4.@}, @{1., 2.@}@}
@end example

@anchor{x-subscript-outer-product}
@code{A(i, j, ...)} is defined as the @emph{outer product} of the indices @code{(i, j, ...)} with operator @code{A}, because this operation sees much more use in practice than @code{map(A, i, j ...)}.
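
The two kinds of subscripting can be contrasted with explicit loops. This is a plain C++ sketch for a 2-array stored as a vector of rows; @code{subscript_pairs} and @code{subscript_outer} are hypothetical names and are not part of @code{ra::}.

@example
@verbatim
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using Index = std::vector<std::size_t>;

// Like map(A, i, j): the indices are paired up, one result per pair.
// Assumes that i and j have the same size.
std::vector<double> subscript_pairs(Matrix const & A, Index const & i, Index const & j)
{
    std::vector<double> b(i.size());
    for (std::size_t k=0; k!=i.size(); ++k) {
        b[k] = A[i[k]][j[k]];
    }
    return b;
}

// Like A(i, j): every element of i is combined with every element of j.
Matrix subscript_outer(Matrix const & A, Index const & i, Index const & j)
{
    Matrix b(i.size(), std::vector<double>(j.size()));
    for (std::size_t k=0; k!=i.size(); ++k) {
        for (std::size_t l=0; l!=j.size(); ++l) {
            b[k][l] = A[i[k]][j[l]];
        }
    }
    return b;
}
@end verbatim
@end example

The paired form produces a result with the shape of the indices, while the outer product form produces a result whose shape is the concatenation of the shapes of the indices.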

@cindex elision, index
You may give fewer subscripts than the rank of the array. The full extent is assumed for the missing subscripts (cf @ref{x-all,@code{all}} below):

@example
@verbatim
ra::Big<int, 3> a({2, 2, 2}, {1, 2, 3, 4, 5, 6, 7, 8});
auto a0 = a(0); // same as a(0, ra::all, ra::all)
auto a10 = a(1, 0); // same as a(1, 0, ra::all)
@end verbatim
@result{} a0 = @{@{1, 2@}, @{3, 4@}@}
@result{} a10 = @{5, 6@}
@end example

This supports the notion (@pxref{Rank polymorphism}) that a 3-array is also a 2-array where the elements are 1-arrays themselves, or a 1-array where the elements are 2-arrays. This important property is directly related to the mechanism of rank extension (@pxref{Rank extension}).

Besides, when the subscripts @code{i, j, ...} are scalars or integer sequences of the form @code{(o, o+s, ..., o+s*(n-1))} (@dfn{linear ranges}), the subscripting can be performed immediately at constant cost, and without needing to construct an expression object. This optimization is called @ref{Drag along and beating,@dfn{beating}}.
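
To see why subscripting with a linear range can have constant cost, consider this minimal sketch of a rank-1 strided view in plain C++. The names @code{View1} and @code{slice} are hypothetical and are not part of @code{ra::}, whose views are more general.

@example
@verbatim
#include <cstddef>

// Hypothetical rank-1 strided view: element k lives at data[k*step].
struct View1
{
    double * data;
    std::ptrdiff_t len;
    std::ptrdiff_t step;
    double & operator()(std::ptrdiff_t k) const { return data[k*step]; }
};

// Subscript v with the linear range (o, o+s, ..., o+s*(n-1)).
// Only the view descriptor is recomputed; no element is read or copied,
// so the cost does not depend on n.
View1 slice(View1 const & v, std::ptrdiff_t n, std::ptrdiff_t o, std::ptrdiff_t s)
{
    return View1 { v.data + o*v.step, n, s*v.step };
}
@end verbatim
@end example

The result is again a view on the same memory, so writing through it modifies the original array. The sketch does no bounds checking.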

@code{ra::} isn't smart enough to know when an arbitrary expression might be a linear range, so the following special objects are provided:

@anchor{x-iota}
@deffn @w{Special object} iota count [start:0 [step:1]]
Create a linear range @code{start, start+step, ... start+step*(count-1)}.

This can be used anywhere an array expression is expected.

@example
@verbatim
ra::Big<int, 1> a = ra::iota(4, 3, -2);
@end verbatim
@result{} a = @{3, 1, -1, -3@}
@end example

Here, @code{b} and @code{c} are @code{View}s (@pxref{Containers and views}).
@example
@verbatim
ra::Big<int, 1> a = {1, 2, 3, 4, 5, 6};
auto b = a(iota(3));
auto c = a(iota(3, 3));
@end verbatim
@result{} b = @{1, 2, 3@}
@result{} c = @{4, 5, 6@}
@end example

@cindex TensorIndex
@code{iota()} by itself is an expression of rank 1 and undefined length. It must be used with other terms whose lengths are defined, so that the overall shape of the array expression can be determined. In general, @code{iota<n>()} is an array expression of rank @code{n}+1 that represents the @code{n}-th index of an array expression. This is similar to Blitz++'s @code{TensorIndex}.

@code{ra::} offers the shortcut @code{ra::_0} for @code{ra::iota<0>()}, etc.

@example
@verbatim
ra::Big<int, 1> v = {1, 2, 3};
cout << (v - ra::_0) << endl; // { 1-0, 2-1, 3-2 }
// cout << (ra::_0) << endl; // error: undefined length
// cout << (v - ra::_1) << endl; // error: undefined length on axis 1
ra::Big<int, 2> a({3, 2}, 0);
cout << (a + ra::_0 - ra::_1) << endl; // {{0, -1}, {1, 0}, {2, 1}}
@end verbatim
@end example

When undefined length @code{iota()} is used as a subscript by itself, the result isn't a @code{View}. This allows @code{view(iota())} to match with expressions of different lengths, as in the following example.
@example
@verbatim
ra::Big<int, 1> a = {1, 2, 3, 4, 5, 6};
ra::Big<int, 1> b = {1, 2, 3};
cout << (b + a(iota())) << endl; // a(iota()) is not a View
@end verbatim
@print{} 3
2 4 6
@end example

Note the difference between
@itemize
@item @code{ra::iota<3>()} —
an expression of rank 4 and undefined length, representing a linear sequence over the tensor index of axis 3
@item @code{ra::iota(3)} ≡ @code{ra::iota<0>(3)} —
an expression of rank 1, representing the sequence @code{0, 1, 2}.
@end itemize

@end deffn

@anchor{x-all}
@deffn @w{Special object} all
Create a linear range @code{0, 1, ... (nᵢ-1)} when used as a subscript at the @var{i}-th place of a subscripting expression. This might not be the @var{i}-th argument; see @ref{x-insert,@code{insert}}, @ref{x-dots,@code{dots}}.

This object cannot stand alone as an array expression. All the examples below result in @code{View} objects:

@example
@verbatim
ra::Big<int, 2> a({3, 2}, {1, 2, 3, 4, 5, 6});
auto b = a(ra::all, ra::all); // (1) a view of the whole of a
auto c = a(iota(3), iota(2)); // same as (1)
auto d = a(iota(3), ra::all); // same as (1)
auto e = a(ra::all, iota(2)); // same as (1)
auto f = a(0, ra::all); // first row of a
auto g = a(ra::all, 1); // second column of a
auto h = a(ra::all, ra::dots<0>, 1); // same as g
@end verbatim
@end example

@code{all} is a special case (@code{dots<1>}) of the more general object @code{dots}.

@end deffn

@anchor{x-dots}
@deffn @w{Special object} dots<n>
Equivalent to as many instances of @code{ra::all} as indicated by @code{n}, which must not be negative. Each instance takes the place of one argument to the subscripting operation.

If @var{n} is defaulted (@code{dots<>}), all available places will be used; this can only be done once in a given subscript list.

This object cannot stand alone as an array expression. All the examples below result in @code{View} objects:

@example
@verbatim
ra::Big<int, 3> a({3, 2, 4}, ...);
auto h = a(); // all of a

auto b = a(ra::all, ra::all, ra::all); // (1) all of a
auto c = a(ra::dots<3>); // same as (1)
auto d = a(ra::all, ra::dots<2>); // same as (1)
auto e = a(ra::dots<2>, ra::all); // same as (1)
auto f = a(ra::dots<>); // same as (1)

auto j0 = a(0, ra::dots<2>); // first page of a
auto j1 = a(0); // same
auto j2 = a(0, ra::dots<>); // same

auto k0 = a(ra::all, 0); // first row of a
auto k1 = a(ra::all, 0, ra::all); // same
auto k2 = a(ra::all, 0, ra::dots<>); // same
auto k3 = a(ra::dots<>, 0, ra::all); // same
// auto k = a(ra::dots<>, 0, ra::dots<>); // error

auto l0 = a(ra::all, ra::all, 0); // first column of a
auto l1 = a(ra::dots<2>, 0); // same
auto l2 = a(ra::dots<>, 0); // same
@end verbatim
@end example

This is useful when writing rank-generic code; see @code{examples/maxwell.cc} in the distribution for an example.

@end deffn

The following special objects aren't related to linear ranges, but they are meant to be used in a subscript context. Using them in other contexts will result in a compile time error.

@cindex @code{len}
@cindex @code{end}, Octave/Matlab
@anchor{x-len}
@deffn @w{Special object} len
Represents the length of the @var{i}-th axis of a subscripted expression, when used at the @var{i}-th place of a subscripting expression.

This works like @code{end} in Octave/Matlab, but note that @code{ra::} indices begin at 0, so the last element of a vector @code{a} is @code{a(ra::len-1)}.

@example
@verbatim
ra::Big<int, 2> a({10, 10}, 100 + ra::_0 - ra::_1);
auto a0 = a(ra::len-1); // last row of a; ra::len is a.len(0)
auto a1 = a(ra::all, ra::len-1); // last column of a; ra::len is a.len(1)
auto a2 = a(ra::len-1, ra::len-1); // last element of last row; the first ra::len is a.len(0) and the second one is a.len(1)
auto a3 = a(ra::all, ra::iota(2, ra::len-2)); // last two columns of a
auto a4 = a(ra::iota(ra::len/2, 1, 2)); // odd rows of a
a(ra::len - std::array {1, 3, 4}) = 0; // set to 0 the 1st, 3rd and 4th rows of a, counting from the end
@end verbatim
@end example

@example
@verbatim
ra::Big<int, 3> b({2, 3, 4}, ...);
auto b0 = b(ra::dots<2>, ra::len-1); // ra::len is b.len(2)
auto b1 = b(ra::insert<1>, ra::len-1); // ra::len is b.len(0)
@end verbatim
@end example

@end deffn

@cindex @code{insert}
@anchor{x-insert}
@deffn @w{Special object} insert<n>
Inserts @code{n} new axes at the subscript position. @code{n} must not be negative.

The new axes have step 0 and undefined length, so they will match any length on those axes by repeating items. @code{insert} objects cannot stand alone as an array expression. The examples below result in @code{View} objects:

@example
@verbatim
auto h = a(insert<0>); // same as (1)
auto k = a(insert<1>); // shape [undefined, 3, 2]
@end verbatim
@end example

@cindex broadcasting, singleton, Numpy
The main use of @code{insert<n>} is to prepare arguments for broadcasting. In other array systems (e.g. Numpy) broadcasting is done with singleton dimensions, that is, dimensions of length one match dimensions of any length. In @code{ra::} singleton dimensions aren't special, so broadcasting requires the use of @code{insert}. For example: @c [ma115]

@example
@verbatim
ra::Big<int, 1> x = {1, 10};
// match shapes [2, U, U] with [U, 3, 2] to produce [2, 3, 2]
cout << x(ra::all, ra::insert<2>) * a(insert<1>) << endl;
@end verbatim
@print{} 2 3 2
1 2
3 4
5 6

10 20
30 40
50 60
@end example

@end deffn

We were speaking earlier of the outer product of the subscripts with operator @code{A}. Here's a way to perform the actual outer product (with operator @code{*}) of two @code{View}s, through broadcasting. All three lines are equivalent. See @ref{x-from,@code{from}} for a more general way to compute outer products.
@example
@verbatim
cout << (A(ra::dots<A.rank()>, ra::insert<B.rank()>) * B(ra::insert<A.rank()>, ra::dots<B.rank()>)) << endl;
cout << (A(ra::dots<>, ra::insert<B.rank()>) * B(ra::insert<A.rank()>, ra::dots<>)) << endl; // default dots<>
cout << (A * B(ra::insert<A.rank()>)) << endl; // index elision + prefix matching
@end verbatim
@end example

@subsection Subscripting and rank-0 views

@cindex view, rank 0
@cindex rank, runtime
@cindex rank, compile-time
When the result of the subscripting operation would have rank 0, the type returned is the type of the view @emph{element} and not a rank-0 view as long as the rank of the result can be determined at compile time. When that's not possible (for instance, the subscripted view has rt rank) then a rank-0 view is returned instead. An automatic conversion is defined for rank-0 views, but manual conversion may be needed in some contexts.

@example
@verbatim
using T = std::complex<double>;
int f(T &);
Big<T, 2> a({2, 3}, 0); // ct rank
Big<T> b({2, 3}, 0); // rt rank

cout << a(0, 0).real() << endl; // ok, a(0, 0) returns complex &
// cout << b(0, 0).real() << endl; // error, View<T> has no member real
cout << ((T &)(b(0, 0))).real() << endl; // ok, manual conversion to T &

cout << f(b(0, 0)) << endl; // ok, automatic conversion from View<T> to T &
// cout << f(a(0)) << endl; // compile time error, conversion failed since ct rank of a(0) is not 0
// cout << f(b(0)) << endl; // runtime error, conversion failed since rt rank of b(0) is not 0
@end verbatim
@end example


@c ------------------------------------------------
@node Functions
@section Functions
@c ------------------------------------------------

You don't need to use @ref{Array operations,@code{map}} every time you want to do something with arrays in @code{ra::}. A number of array functions are already defined.

@anchor{x-scalar-ops}
@subsection Standard scalar operations

@code{ra::} defines array extensions for @code{+}, @code{-} (both unary and binary), @code{*}, @code{/}, @code{!}, @code{&&}, @code{||}@footnote{@code{&&}, @code{||} are short-circuiting as array operations; the elements of the second operand won't be evaluated if the elements of the first one evaluate to @code{false} or @code{true}, respectively.
Note that if both operands are of rank 0 and at least one of them is an @code{ra::} object, there is no way to preserve the behavior of @code{&&} and @code{||} with built-in types and avoid evaluating both, since the overloaded operators @url{http://en.cppreference.com/w/cpp/language/operators, are normal functions}.}, @code{>}, @code{<}, @code{>=}, @code{<=}, @code{<=>}, @code{==}, @code{!=}, @code{pow}, @code{abs}, @code{cos}, @code{sin}, @code{exp}, @code{expm1}, @code{sqrt}, @code{log}, @code{log1p}, @code{log10}, @code{isfinite}, @code{isnan}, @code{isinf}, @code{max}, @code{min}, @code{asin}, @code{acos}, @code{atan}, @code{atan2}, @code{cosh}, @code{sinh}, @code{tanh}, and @code{lerp}.
Extending other scalar operations is straightforward; see @ref{x-new-array-operations,New array operations}. @code{ra::} also defines (and extends) the non-standard functions @code{odd}, @ref{x-sqr,@code{sqr}}, @ref{x-sqrm,@code{sqrm}}, @ref{x-conj,@code{conj}}, @ref{x-rel-error,@code{rel_error}}, and @ref{x-xI,@code{xI}}.

For example:
@example @c [ma110]
@verbatim
cout << exp(ra::Small<double, 3> {4, 5, 6}) << endl;
@end verbatim
  @print{} 54.5982 148.413 403.429
@end example

@subsection Conditional operations

@ref{x-map,@code{map}} evaluates all of its arguments before passing them along to its operator. This isn't always what you want. The simplest example is @code{where(condition, iftrue, iffalse)}, which returns an expression that will evaluate @code{iftrue} when @code{condition} is true and @code{iffalse} otherwise.

@example
@verbatim
ra::Big<double> x ...
ra::Big<double> y = where(x>0, expensive_expr_1(x), expensive_expr_2(x));
@end verbatim
@end example

Here @code{expensive_expr_1} and @code{expensive_expr_2} are array expressions. For each element, only the arm selected by the condition is evaluated. The computation of the other arm would be wasted if one were to write instead

@example
@verbatim
ra::Big<double> y = map([](auto && w, auto && t, auto && f) -> decltype(auto) { return w ? t : f; },
                        x>0, expensive_expr_1(x), expensive_expr_2(x));
@end verbatim
@end example

If the expressions have side effects, then @code{map} won't even give the right result.

@c [ma109]
@example
@verbatim
ra::Big<int, 1> o = {};
ra::Big<int, 1> e = {};
ra::Big<int, 1> n = {1, 2, 7, 9, 12};
ply(where(odd(n), map([&o](auto && x) { o.push_back(x); }, n), map([&e](auto && x) { e.push_back(x); }, n)));
cout << "o: " << ra::noshape << o << ", e: " << ra::noshape << e << endl;
@end verbatim
@print{} o: 1 7 9, e: 2 12
@end example
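
The difference between evaluating both arms and evaluating only the selected one can also be seen without @code{ra::}. The following plain C++ sketch uses the hypothetical functions @code{select_eager} and @code{select_lazy}, with the two arms passed as callables; it only illustrates the evaluation order and says nothing about how @code{where} is implemented.

@example
@verbatim
#include <cstddef>
#include <vector>

// Eager selection, as with a plain map: both arms are evaluated for every
// element, then one of the two results is discarded.
template <class T, class F>
std::vector<double> select_eager(std::vector<double> const & x, T && iftrue, F && iffalse)
{
    std::vector<double> y(x.size());
    for (std::size_t i=0; i!=x.size(); ++i) {
        double const t = iftrue(x[i]);
        double const f = iffalse(x[i]);
        y[i] = x[i]>0 ? t : f;
    }
    return y;
}

// Lazy selection, in the manner of where: only the chosen arm is evaluated.
template <class T, class F>
std::vector<double> select_lazy(std::vector<double> const & x, T && iftrue, F && iffalse)
{
    std::vector<double> y(x.size());
    for (std::size_t i=0; i!=x.size(); ++i) {
        y[i] = x[i]>0 ? iftrue(x[i]) : iffalse(x[i]);
    }
    return y;
}
@end verbatim
@end example

If the callables count their invocations or have other side effects, the two functions return the same values but leave different side effects behind.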

FIXME Artificial example.
FIXME Do we want to expose ply(); this is the only example in the manual that uses it.

When the choice is between more than two expressions, there's @ref{x-pick,@code{pick}}, which operates similarly, but accepts an integer instead of a boolean selector.

@subsection Special operations

Some operations are essentially scalar operations, but require special syntax and would need a lambda wrapper to be used with @code{map}. @code{ra::} comes with a few of these already defined.

FIXME

@subsection Elementwise reductions

@code{ra::} defines the whole-array one-argument reductions @code{any}, @code{every}, @code{amax}, @code{amin}, @code{sum}, @code{prod} and the two-argument reductions @code{dot} and @code{cdot}. Note that @code{max} and @code{min} are two-argument scalar operations with array extensions, while @code{amax} and @code{amin} are reductions. @code{any} and @code{every} are short-circuiting.

You can define reductions the same way @code{ra::} does:

@example
@verbatim
template <class A>
inline auto op_reduce(A && a)
{
    T c = op_default;
    for_each([&c](auto && a) { c = op(c, a); }, a);
    return c;
}
@end verbatim
@end example

Often enough you need to reduce over particular axes. This is possible by combining assignment operators with the @ref{Rank extension,rank extension} mechanism, or using the @ref{The rank conjunction,rank conjunction}, or iterating over @ref{Cell iteration, cells of higher rank}. For example:

@example
@verbatim
    ra::Big<double, 2> a({m, n}, ...);

    ra::Big<double, 1> sum_rows({n}, 0.);
    iter<1>(sum_rows) += iter<1>(a);
    // for_each(ra::wrank<1, 1>([](auto & c, auto && a) { c += a; }), sum_rows, a) // alternative
    // sum_rows += transpose<1, 0>(a); // another

    ra::Big<double, 1> sum_cols({m}, 0.);
    sum_cols += a;
@end verbatim
@end example
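
For reference, these are the loops that the two reductions above stand for, written in plain C++ on a vector of rows. The function names are hypothetical, no @code{ra::} types are used, and @code{sum_rows} assumes that every row of @code{a} has at least @code{n} elements.

@example
@verbatim
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;

// Adds the rows of a together: result(j) = sum over i of a(i, j).
Vector sum_rows(Matrix const & a, std::size_t n)
{
    Vector s(n, 0.);
    for (auto const & row: a) {
        for (std::size_t j=0; j!=n; ++j) {
            s[j] += row[j];
        }
    }
    return s;
}

// Adds the columns of a together: result(i) = sum over j of a(i, j).
Vector sum_cols(Matrix const & a)
{
    Vector s(a.size(), 0.);
    for (std::size_t i=0; i!=a.size(); ++i) {
        for (double x: a[i]) {
            s[i] += x;
        }
    }
    return s;
}
@end verbatim
@end example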

FIXME example with assignment op

A few common operations of this type are already packaged in @code{ra::}.

@subsection Special reductions

@code{ra::} defines the following special reductions.

FIXME

@subsection Shortcut reductions

Some reductions do not need to traverse the whole array if a certain condition is encountered early. The most obvious ones are the reductions of @code{&&} and @code{||}, which @code{ra::} defines as @code{every} and @code{any}.

FIXME

These operations are defined on top of another function @code{early}.

FIXME early

The following is often useful.

FIXME lexicographical compare etc.

@c ------------------------------------------------
@node The rank conjunction
@section The rank conjunction
@c ------------------------------------------------

We have seen in @ref{Cell iteration} that it is possible to treat an r-array as an array of lower rank with subarrays as its elements. With the @ref{x-iter,@code{iter<cell rank>}} construction, this ‘exploding’ is performed (notionally) on the argument; the operation of the array expression is applied blindly to these cells, whatever they turn out to be.

@example
@verbatim
for_each(my_sort, iter<1>(A)); // (in ra::) my_sort is a regular function, cell rank must be given
for_each(my_sort, iter<0>(A)); // (in ra::) error, bad cell rank
@end verbatim
@end example

@c @cindex J
The array language J has instead the concept of @dfn{verb rank}. Every function (or @dfn{verb}) has an associated cell rank for each of its arguments. Therefore @code{iter<cell rank>} is not needed.

@example
@verbatim
for_each(sort_rows, A); // (not in ra::) will iterate over 1-cells of A, sort_rows knows
@end verbatim
@end example

@c @cindex J
@code{ra::} doesn't have ‘verb ranks’ yet. In practice one can think of @code{ra::}'s operations as having a verb rank of 0. However, @code{ra::} supports a limited form of J's @dfn{rank conjunction} with the function @ref{x-wrank,@code{wrank}}.

@c @cindex J
This is an operator that takes one verb (such operators are known as @dfn{adverbs} in J) and produces another verb with different ranks. These ranks are used for rank extension through prefix agreement, but then the original verb is used on the cells that result. The rank conjunction can be nested, and this allows repeated rank extension before the innermost operation is applied.

A standard example is ‘outer product’.

@example
@verbatim
ra::Big<int, 1> a = {1, 2, 3};
ra::Big<int, 1> b = {40, 50};
ra::Big<int, 2> axb = map(ra::wrank<0, 1>([](auto && a, auto && b) { return a*b; }),
                            a, b);
@end verbatim
@result{} axb = @{@{40, 50@}, @{80, 100@}, @{120, 150@}@}
@end example

It works like this. The verb @code{ra::wrank<0, 1>([](auto && a, auto && b) @{ return a*b; @})} has verb ranks (0, 1), so the 0-cells of @code{a} are paired with the 1-cells of @code{b}. In this case @code{b} has a single 1-cell. The frames and the cell shapes of each operand are:

@example
@verbatim
a: 3 |
b:   | 2
@end verbatim
@end example

Now the frames are rank-extended through prefix agreement.

@example
@verbatim
a: 3 |
b: 3 | 2
@end verbatim
@end example

Now we need to perform the operation on each cell. The verb @code{[](auto && a, auto && b) @{ return a*b; @}} has verb ranks (0, 0). This results in the 0-cells of @code{a} (which have shape ()) being rank-extended to the shape of the 1-cells of @code{b} (which is (2)).

@example
@verbatim
a: 3 | 2
b: 3 | 2
@end verbatim
@end example
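
The steps above amount to the following loop nest, sketched here in plain C++ with no @code{ra::} types (@code{outer_product} is a hypothetical name). The outer loop runs over the frame and the inner loop over the cell, so the result has shape (3 2) for the operands of the example.

@example
@verbatim
#include <cstddef>
#include <vector>

using Vector = std::vector<int>;
using Matrix = std::vector<Vector>;

Matrix outer_product(Vector const & a, Vector const & b)
{
    Matrix axb(a.size(), Vector(b.size()));
    for (std::size_t i=0; i!=a.size(); ++i) {        // frame: a 0-cell of a with the whole of b
        for (std::size_t j=0; j!=b.size(); ++j) {    // cell: a(i) is extended over b
            axb[i][j] = a[i] * b[j];
        }
    }
    return axb;
}
@end verbatim
@end example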

This use of the rank conjunction is packaged in @code{ra::} as the @ref{x-from,@code{from}} operator. It supports any number of arguments, not only two.

@example
@verbatim
ra::Big<int, 1> a = {1, 2, 3};
ra::Big<int, 1> b = {40, 50};
ra::Big<int, 2> axb = from([](auto && a, auto && b) { return a*b; }, a, b);
@end verbatim
@result{} axb = @{@{40, 50@}, @{80, 100@}, @{120, 150@}@}
@end example

Another example is matrix multiplication. For 2-array arguments C, A and B with shapes C: (m, n), A: (m, p) and B: (p, n), we want to perform the operation C(i, j) += A(i, k)*B(k, j). The axis alignment gives us the ranks we need to use.

@example
@verbatim
C: m |   | n
A: m | p |
B:   | p | n
@end verbatim
@end example

First we'll align the first axes of C and A using the cell ranks (1, 1, 2). The cell shapes are:

@example
@verbatim
C: m | n
A: m | p
B:   | p n
@end verbatim
@end example

Then we'll use the ranks (1, 0, 1) on the cells:

@example
@verbatim
C: m |   | n
A: m | p |
B:   | p | n
@end verbatim
@end example

The final operation is a standard operation on arrays of scalars. In actual @code{ra::} syntax:

@example @c [ma103]
@verbatim
ra::Big A({3, 2}, {1, 2, 3, 4, 5, 6});
ra::Big B({2, 3}, {7, 8, 9, 10, 11, 12});
ra::Big C({3, 3}, 0.);
for_each(ra::wrank<1, 1, 2>(ra::wrank<1, 0, 1>([](auto && c, auto && a, auto && b) { c += a*b; })), C, A, B);
@end verbatim
@result{} C = @{@{27, 30, 33@}, @{61, 68, 75@}, @{95, 106, 117@}@}
@end example
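
Read as explicit loops, the nested rank conjunction corresponds to the loop order i, k, j. The following plain C++ sketch (hypothetical name @code{gemm_accumulate}, no @code{ra::} types) makes that order visible; it assumes that the shapes of the three arguments are compatible.

@example
@verbatim
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// C(i, j) += A(i, k)*B(k, j)
void gemm_accumulate(Matrix & C, Matrix const & A, Matrix const & B)
{
    for (std::size_t i=0; i!=A.size(); ++i) {             // outer ranks (1, 1, 2): frame m
        for (std::size_t k=0; k!=B.size(); ++k) {         // inner ranks (1, 0, 1): frame p
            for (std::size_t j=0; j!=C[i].size(); ++j) {  // rank 0 operation extended over n
                C[i][j] += A[i][k] * B[k][j];
            }
        }
    }
}
@end verbatim
@end example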

Note that @code{wrank} cannot be used to transpose axes in general.

I hope that in the future something like @code{C(i, j) += A(i, k)*B(k, j)}, where @code{i, j, k} are special objects, can be automatically translated to the requisite combination of @code{wrank} and perhaps also @ref{x-transpose,@code{transpose}}. For the time being, you have to align or transpose the axes yourself.

@c ------------------------------------------------
@node Compatibility
@section Compatibility
@c ------------------------------------------------

@subsection Using other C and C++ types with @code{ra::}

@cindex foreign type
@anchor{x-foreign-type}
@code{ra::} accepts certain types from outside @code{ra::} (@dfn{foreign types}) as array expressions. Generally it is enough to mix the foreign type with a type from @code{ra::} and let deduction work.

@example
@verbatim
std::vector<int> x = {1, 2, 3};
ra::Small<int, 3> y = {6, 5, 4};
cout << (x-y) << endl;
@end verbatim
@print{} -5 -3 -1
@end example

@cindex @code{start}
Foreign types can be brought into @code{ra::} explicitly with the function @ref{x-start,@code{start}}.

@example
@verbatim
std::vector<int> x = {1, 2, 3};
// cout << sum(x) << endl; // error, sum not found
cout << sum(ra::start(x)) << endl;
cout << ra::sum(x) << endl;
@end verbatim
@print{} 6
6
@end example

The following types are accepted as foreign types:

@itemize
@item Built-in arrays
@cindex built-in array
produce an expression of positive rank and ct size.
@item @code{std::array}
produces an expression of rank 1 and ct size.
@item Other types conforming to @code{std::ranges::random_access_range}, including @code{std::vector}, @code{std::string}, etc.
produce an expression of rank 1 and rt size.
@end itemize

Raw pointers must be brought into @code{ra::} explicitly using the function @ref{x-ptr,@code{ptr}}, which produces an expression of rank 1 and @emph{undefined} size.

Compare:

@example @c [ma106]
@verbatim
int p[] = {1, 2, 3};
int * z = p;
ra::Big<int, 1> q {1, 2, 3};
q += p; // ok, q is ra::, p is foreign object with shape info
ra::start(p) += q; // can't redefine operator+=(int[]), foreign needs ra::start()
// z += q; // error: raw pointer needs ra::ptr()
ra::ptr(z) += p; // ok, shape is determined by foreign object p
@end verbatim
@end example

@anchor{x-is-scalar}
Some types are accepted automatically as scalars. These include:
@itemize
@item
Any type @code{T} for which @code{std::is_scalar_v<T>} is true, @emph{except} pointers. These include @code{char}, @code{int}, @code{double}, etc.
@item
@code{std::complex<T>}, if you include @code{ra/complex.hh}.
@end itemize

You can add your own types as scalar types with the following declaration (see @code{ra/complex.hh}):

@verbatim
    namespace ra { template <> constexpr bool is_scalar_def<MYTYPE> = true; }
@end verbatim

Otherwise, you can bring a scalar object into @code{ra::} on the spot, with the function @ref{x-scalar,@code{scalar}}.

@subsection Using @code{ra::} types with the STL

General @code{ra::} @ref{Containers and views,views} provide STL compatible @code{ForwardIterator}s through the members @code{begin()} and @code{end()}. These iterators traverse the elements of the array (0-cells) in row major order, regardless of the storage order of the view.

For @ref{Containers and views,containers} @code{begin()} provides @code{RandomAccessIterator}s, which is handy for certain functions such as @code{sort}. There's no reason why all views couldn't provide @code{RandomAccessIterator}, but these wouldn't be efficient for ranks above 1, and I haven't implemented them. The container @code{RandomAccessIterator}s that are provided are in fact raw pointers.

@example @c [ma106]
@verbatim
ra::Big<int> x {3, 2, 1}; // x is a Container
auto y = x(); // y is a View on x
// std::sort(y.begin(), y.end()); // error: y.begin() is not RandomAccessIterator
std::sort(x.begin(), x.end()); // ok, we know x is stored in row-major order
@end verbatim
@result{} x = @{1, 2, 3@}
@end example

@cindex other libraries, interfacing with
@subsection Using @code{ra::} types with other libraries

When you have to pass arrays back and forth between your program and an external library, perhaps even another language, it is necessary for both sides to agree on memory layout. @code{ra::} gives you access to its own memory layout and allows you to obtain an @code{ra::} view on any type of memory.

@subsubsection The good array citizen

@c FIXME Put these in examples/ and reference them here.

@cindex BLIS
The good array citizen describes its arrays with the same parameters as @code{ra::}: a pointer, plus a length and a step per dimension. You don't have to worry about special cases. Take @url{https://github.com/flame/blis, BLIS}, for example:

@quotation
@verbatim
#include <blis.h>

ra::Big<double, 2> A({M, K}, ...);
ra::Big<double, 2> B({K, N}, ...);
ra::Big<double, 2> C({M, N}, ...);
double alpha = ...;
double beta = ...;

// C := βC + αAB
bli_dgemm(BLIS_NO_TRANSPOSE, BLIS_NO_TRANSPOSE, M, N, K, &alpha,
          A.data(), A.step(0), A.step(1),
          B.data(), B.step(0), B.step(1),
          &beta, C.data(), C.step(0), C.step(1));
@end verbatim
@end quotation

@cindex FFTW
Another good array citizen, @url{http://fftw.org, FFTW}, handles arbitrary rank:

@quotation
@verbatim
#include <fftw3.h>

...

using complex = std::complex<double>;
static_assert(sizeof(complex)==sizeof(fftw_complex));

// forward DFT over the last r axes of a -> b
void fftw(int r, ra::View<complex> const a, ra::View<complex> b)
{
    int const rank = a.rank();
    assert(r>0 && r<=rank);
    assert(every(ra::start(shape(a))==shape(b)));
    fftw_iodim dims[r];
    fftw_iodim howmany_dims[rank-r];
    for (int i=0; i!=rank; ++i) {
        if (i>=rank-r) {
            dims[i-rank+r].n = a.len(i);
            dims[i-rank+r].is = a.step(i);
            dims[i-rank+r].os = b.step(i);
        } else {
            howmany_dims[i].n = a.len(i);
            howmany_dims[i].is = a.step(i);
            howmany_dims[i].os = b.step(i);
        }
    }
    fftw_plan p;
    p = fftw_plan_guru_dft(r, dims, rank-r, howmany_dims,
                           (fftw_complex *)(a.data()), (fftw_complex *)(b.data()),
                           FFTW_FORWARD, FFTW_ESTIMATE);
    fftw_execute(p);
    fftw_destroy_plan(p);
}
@end verbatim
@end quotation

@cindex Guile
Translating array descriptors from a foreign language should be fairly simple. For example, this is how to convert a @url{https://www.gnu.org/software/guile/manual/html_node/Accessing-Arrays-from-C.html#Accessing-Arrays-from-C,Guile} array view into an @code{ra::} view:

@quotation
@verbatim
    SCM a; // say a is #nf64(...)

    ...

    scm_t_array_handle h;
    scm_array_get_handle(a, &h);
    scm_t_array_dim const * dims = scm_array_handle_dims(&h);
    View<double> v(map([](int i) { return ra::Dim { dims[i].ubnd-dims[i].lbnd+1, dims[i].inc }; },
                       ra::iota(scm_array_handle_rank(&h))),
                   scm_array_handle_f64_writable_elements(&h));

    ...

    scm_array_handle_release(&h);
@end verbatim
@end quotation

@cindex Numpy
@cindex Python
Numpy's C API has the type @url{https://docs.scipy.org/doc/numpy/reference/c-api.array.html,@code{PyArrayObject}} which can be used in the same way as Guile's @code{scm_t_array_handle} in the example above.

It is usually simpler to let the foreign language handle the memory, even though there should be ways to transfer ownership (e.g. Guile has @url{https://www.gnu.org/software/guile/manual/html_node/SRFI_002d4-API.html#index-scm_005ftake_005ff64vector,@code{scm_take_xxx}}).

@subsubsection The bad array citizen

Unfortunately there are many libraries that don't accept arbitrary array parameters, or that do strange things with particular values of lengths and/or steps.

The most common case is that a library doesn't handle steps at all, and it only accepts unit step for rank 1 arrays, or packed row-major or column-major storage for higher rank arrays. In that case, you might be forced to copy your array before passing it along.
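
As a rough sketch of this check-then-copy approach, the following standalone code tests whether a rank-2 strided array is packed in row-major order, and copies it if it isn't. @code{View2}, @code{is_packed_row_major} and @code{pack_row_major} are hypothetical names made up for this illustration and are not part of @code{ra::}.

@example
@verbatim
```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plain descriptor of a rank-2 strided view; not an ra:: type.
struct View2
{
    double const * data;
    std::ptrdiff_t len[2];
    std::ptrdiff_t step[2];
};

// True if the elements are laid out contiguously in row-major order.
// Conservative: steps on axes of length 1 are checked even though they don't matter.
bool is_packed_row_major(View2 const & v)
{
    return v.step[1]==1 && v.step[0]==v.len[1];
}

// Copy the elements of v into new packed row-major storage.
std::vector<double> pack_row_major(View2 const & v)
{
    assert(v.len[0]>=0 && v.len[1]>=0);
    std::vector<double> out;
    out.reserve(std::size_t(v.len[0]*v.len[1]));
    for (std::ptrdiff_t i=0; i<v.len[0]; ++i) {
        for (std::ptrdiff_t j=0; j<v.len[1]; ++j) {
            out.push_back(v.data[i*v.step[0] + j*v.step[1]]);
        }
    }
    return out;
}
```
@end verbatim
@end example

One would pass the original data pointer when the check succeeds, and the packed copy otherwise.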

@c FIXME using is_c_order, etc.

Other libraries do accept steps, but not arbitrary ones. For example, @code{cblas_dgemm} from @url{https://www.netlib.org/blas,CBLAS} has this prototype:

@quotation
@verbatim
cblas_dgemm(order, transA, transB, m, n, k, alpha, A, lda, B, ldb, beta, C, ldc);
@end verbatim
@end quotation

@code{A}, @code{B}, @code{C} are (pointers to) 2-arrays, but the routine accepts only one step argument for each (@code{lda}, etc.). CBLAS also doesn't understand @code{lda} as an arbitrary step, but rather as the dimension of a larger array that you're slicing @code{A} from, and some implementations will mishandle negative or zero @code{lda}.

Sometimes you can work around this by fiddling with @code{transA} and @code{transB}, but in general you need to check your array parameters and you may need to make copies.
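
For a row-major CBLAS call, this check can be sketched as a small decision function. It assumes the usual CBLAS requirement that the leading dimension is at least the row length of the matrix as stored; @code{Pass}, @code{Plan} and @code{plan_for_cblas} are hypothetical names that belong neither to @code{ra::} nor to CBLAS.

@example
@verbatim
```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// How a rank-2 view could be handed to a row-major CBLAS routine.
enum class Pass { no_trans, trans, copy };

struct Plan { Pass pass; std::ptrdiff_t ld; };

// len0, len1 are the lengths of the view and step0, step1 its steps, in elements.
Plan plan_for_cblas(std::ptrdiff_t len0, std::ptrdiff_t len1,
                    std::ptrdiff_t step0, std::ptrdiff_t step1)
{
    assert(len0>=0 && len1>=0);
    if (step1==1 && step0>=std::max<std::ptrdiff_t>(1, len1)) {
        return { Pass::no_trans, step0 }; // rows are contiguous: use the view as is
    } else if (step0==1 && step1>=std::max<std::ptrdiff_t>(1, len0)) {
        return { Pass::trans, step1 }; // columns are contiguous: pass it as transposed
    } else {
        return { Pass::copy, 0 }; // negative, zero or doubly non-unit steps: copy first
    }
}
```
@end verbatim
@end example

In the @code{trans} case one would flip the corresponding @code{transA} or @code{transB} argument and swap the roles of the two lengths.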

@cindex OpenGL
OpenGL is another library that requires @url{https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/glVertexAttribPointer.xhtml,contortions}:

@quotation
@verbatim
void glVertexAttribPointer(GLuint index,
                           GLint size,
                           GLenum type,
                           GLboolean normalized,
                           GLsizei stride,
                           const GLvoid * pointer);
@end verbatim

[...]

@emph{stride}

    @quotation
    Specifies the byte offset between consecutive generic vertex attributes. If stride is 0, the generic vertex attributes are understood to be tightly packed in the array. The initial value is 0.
    @end quotation
@end quotation

It isn't clear whether negative steps are legal, either. So just as with CBLAS, passing arbitrary array views may require copies.

@c ------------------------------------------------
@node Extension
@section Extension
@c ------------------------------------------------

@subsection New scalar types

@code{ra::} will let you construct arrays of arbitrary types out of the box. This is the same functionality you get with e.g. @code{std::vector}.

@example
@verbatim
struct W { int x; };
ra::Big<W, 2> w = {{ {4}, {2} }, { {1}, {3} }};
cout << w(1, 1).x << endl;
cout << amin(map([](auto && x) { return x.x; }, w)) << endl;
@end verbatim
@print{} 3
   1
@end example

However, if you want to mix arbitrary types in array operations, you'll need to tell @code{ra::} that that is actually what you want. This is to avoid conflicts with other libraries.

@example
@verbatim
namespace ra { template <> constexpr bool is_scalar_def<W> = true; }
...
W ww {11};
for_each([](auto && x, auto && y) { cout << (x.x + y.x) << " "; }, w, ww); // ok
@end verbatim
@print{} 15 13 12 14
@end example

but

@example
@verbatim
struct U { int x; };
U uu {11};
for_each([](auto && x, auto && y) { cout << (x.x + y.x) << " "; }, w, uu); // error: can't find ra::start(U)
@end verbatim
@end example

@anchor{x-new-array-operations}
@subsection New array operations

@code{ra::} provides array extensions for standard operations such as @code{+}, @code{*}, @code{cos} @ref{x-scalar-ops,and so on}. You can add array extensions for your own operations in the obvious way, with @ref{x-map,@code{map}} (but note the namespace qualifiers):

@example
@verbatim
return_type my_fun(...) { };
...
namespace ra {
template <class ... A> inline auto
my_fun(A && ... a)
{
    return map(::my_fun, std::forward<A>(a) ...);
}
} // namespace ra
@end verbatim
@end example

@cindex overload set
If @code{my_fun} is an overload set, you can use@footnote{Simplified; see the references in @url{http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1170r0.html}.}

@example
@verbatim
namespace ra {
template <class ... A> inline auto
my_fun(A && ... a)
{
    return map([](auto && ... a) { return ::my_fun(a ...); }, std::forward<A>(a) ...);
}
} // namespace ra
@end verbatim
@end example

@cindex error
@cindex assert
@c ------------------------------------------------
@node Error handling
@section Error handling
@c ------------------------------------------------

Error handling in @code{ra::} is controlled by two macros:

@itemize
@item @code{RA_DO_CHECK}
is a binary flag that controls runtime checks. The default is 1 which means to check for errors. 0 means not to check. The checks themselves are done with @code{RA_ASSERT}.
@item @code{RA_ASSERT(cond, ...)}
is a function-like macro. @code{cond} is an expression that evaluates to true (in the @code{ra::} namespace) if the assertion is satisfied. The other arguments are informative and do not need to be used. If the assertion fails, the default definition of @code{RA_ASSERT(cond, ...)} prints those arguments to @code{std::cerr} and aborts.
@end itemize

@code{ra::} contains uses of @code{assert} for checking invariants or for sanity checks that are separate from uses of @code{RA_ASSERT}. Those can be disabled in the usual way with @option{-DNDEBUG}.

You can redefine @code{RA_ASSERT} to something that is more appropriate for your program. @code{examples/throw.cc} in the distribution shows how to throw a user-defined exception instead.

@c ------------------------------------------------
@node Extras
@chapter Extras
@c ------------------------------------------------


@c ------------------------------------------------
@node Hazards
@chapter Hazards
@c ------------------------------------------------

Some of these issues arise because @code{ra::} applies its principles systematically, which can have surprising results. Still others are the result of unfortunate compromises. And a few are just bugs.

@section Reuse of expression objects

Expression objects are meant to be used once. This applies to anything produced with @code{ra::map}, @code{ra::iter}, @code{ra::start}, or @code{ra::ptr}. Reuse errors are @emph{not} checked. For example:

@example
@verbatim
ra::Big<int, 2> B({3, 3}, ra::_1 + ra::_0*3); // {{0 1 2} {3 4 5} {6 7 8}}
std::array<int, 2> l = { 1, 2 };
cout << B(ra::ptr(l), ra::ptr(l)) << endl; // ok => {{4 5} {7 8}}
auto ll = ra::ptr(l);
cout << B(ll, ll) << endl; // ??
@end verbatim
@end example

@section Assignment to views

FIXME
With rt-shape containers (e.g. @code{Big}), @code{operator=} replaces the left hand side instead of writing over its contents. This behavior is inconsistent with @code{View::operator=} and is there only so that istream @code{>>} container may work; do not rely on it.

@section View of const vs const view

@c FIXME

FIXME
Passing view arguments by reference

@section Rank extension in assignments

Assignment of an expression onto another expression of lower rank may not do what you expect. This example matches @code{a} and 3 [both of shape ()] with a vector of shape (3). This is equivalent to @code{@{a=3+4; a=3+5; a=3+6;@}}. You may get a different result depending on traversal order.

@example @c [ma107]
@verbatim
int a = 0;
ra::scalar(a) = 3 + ra::Small<int, 3> {4, 5, 6}; // ?
@end verbatim
  @result{} a = 9
@end example

Compare with

@example
@verbatim
int a = 0;
ra::scalar(a) += 3 + ra::Small<int, 3> {4, 5, 6}; // 0 + (3+4) + (3+5) + (3+6)
@end verbatim
  @result{} a = 24
@end example
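
In terms of plain loops, and assuming that the vector is traversed front to back, the two cases above amount to the following. The function names are made up for this illustration.

@example
@verbatim
```cpp
#include <array>
#include <cassert>

// Loop equivalent of ra::scalar(a) = 3 + v, assuming front to back traversal.
int assign_case(std::array<int, 3> const & v)
{
    int a = 0;
    for (int x: v) { a = 3 + x; } // each step overwrites a, so the last one wins
    return a;
}

// Loop equivalent of ra::scalar(a) += 3 + v.
int accumulate_case(std::array<int, 3> const & v)
{
    int a = 0;
    for (int x: v) { a += 3 + x; } // every step contributes to a
    return a;
}
```
@end verbatim
@end example

The accumulating form doesn't depend on the traversal order, while the plain assignment keeps whichever element happens to be visited last.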

@section Performance pitfalls of rank extension

In the following example where @code{b} has its shape extended from (3) to (3, 4), @code{f} is called 12 times, even though only 3 calls are needed if @code{f} doesn't have side effects. In such cases it might be preferable to write the outer loop explicitly, or to do some precomputation.

@example
@verbatim
ra::Big<int, 2> a = {{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12}};
ra::Big<int, 1> b = {1, 2, 3};
ra::Big<int, 2> c = map(f, b) + a;
@end verbatim
@end example
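
The difference can be seen in a standalone sketch with plain loops, where a made-up @code{f} counts its own calls. @code{direct} mimics what the array expression does and @code{hoisted} writes the outer loop explicitly; neither function is part of @code{ra::}.

@example
@verbatim
```cpp
#include <array>
#include <cassert>

int calls = 0; // number of times f has been called

int f(int x) { ++calls; return 10*x; }

using Mat = std::array<std::array<int, 4>, 3>;

// f is evaluated once per element of the (3, 4) result.
Mat direct(std::array<int, 3> const & b, Mat const & a)
{
    Mat c {};
    for (int i=0; i<3; ++i) {
        for (int j=0; j<4; ++j) {
            c[i][j] = f(b[i]) + a[i][j];
        }
    }
    return c;
}

// f is evaluated once per element of b and the value is reused along the row.
Mat hoisted(std::array<int, 3> const & b, Mat const & a)
{
    Mat c {};
    for (int i=0; i<3; ++i) {
        int const fb = f(b[i]);
        for (int j=0; j<4; ++j) {
            c[i][j] = fb + a[i][j];
        }
    }
    return c;
}
```
@end verbatim
@end example

Both functions produce the same result, but @code{direct} calls @code{f} 12 times and @code{hoisted} only 3 times.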

@section Chained assignment

FIXME
When @code{a=b=c} works, it operates as @code{b=c; a=b;} and not as an array expression.

@section Unregistered scalar types

FIXME
@code{View<T, N> x; x = T()} fails if @code{T} isn't registered as @code{is_scalar}.


@c ------------------------------------------------
@node Internals
@chapter Internals
@c ------------------------------------------------

@code{ra::} has two main components: a set of container classes, and the expression template mechanism. The container classes provide leaves for the expression template trees, and the container classes also make some use of the expression template mechanism internally (e.g. in the selection operator, or for initialization).

@menu
* Headers::
* Type hierarchy::
* Term agreement::
* Loop types::
* Introspection::
* Compiling and running::
@end menu

@c ------------------------------------------------
@node Headers
@section Headers
@c ------------------------------------------------

The header structure of @code{ra::} is as follows.@footnote{Diagram generated using Graphviz and @url{https://www.flourish.org/cinclude2dot}.

@verbatim
cd ra && cinclude2dot.pl --include . > headers.dot
dot -Tpng headers.dot -Gdpi=100 > headers.png
@end verbatim
}

@image{headers,4cm}

@c ------------------------------------------------
@node Type hierarchy
@section Type hierarchy
@c ------------------------------------------------

Some of the categories below are C++20 ‘concepts’, some are still informal.

@itemize
@item @b{Container} --- @code{Big}, @code{Shared}, @code{Unique}, @code{Small}

These are array types that own their data in one way or another. Creating or destroying these objects may allocate or deallocate memory, respectively.

@item @b{View} --- @code{View}, @code{SmallView}

These are array views into data in memory, which may be writable. Any of the @b{Container} types can be treated as a @b{View}, but one may also create @b{View}s that aren't associated with any @b{Container}, for example into memory allocated by a different library. Creating and destroying @b{View}s doesn't allocate or deallocate memory for array elements.

@item @b{Iterator} --- @code{CellBig}, @code{CellSmall}, @code{Iota}, @code{Ptr}, @code{Scalar}, @code{Expr}, @code{Pick}

This is a traversable object. @b{Iterator}s are accepted by all the array functions such as @code{map}, @code{for_each}, etc. @code{map} produces an @b{Iterator} itself, so most array expressions are @b{Iterator}s. @b{Iterator}s are created from @b{View}s and from certain foreign array-like types primarily through the function @code{start}. This is done automatically when those types are used in array expressions.

@b{Iterator}s have two traversal functions. The first one, @code{.adv(k, d)}, moves the iterator along any axis @var{k}. The other one, @code{.mov(d)}, is used on linearized views of the array. The methods @code{len()}, @code{step()}, @code{keep_step()} are used to determine the extent of these linearized views. In this way, a loop involving @b{Iterator}s can have its inner loop unfolded, which is faster than a multidimensional loop, especially if the inner dimensions of the loop are small.

@b{Iterator}s also provide an @code{at(i ...)} method for random access to any element of the expression.

@end itemize


@c ------------------------------------------------
@node Term agreement
@section Term agreement
@c ------------------------------------------------

The execution of an expression template begins with the determination of its shape, that is, the length of each of its dimensions. This is done recursively by traversing the terms of the expression. For a given dimension @code{k}≥0, terms that have rank less than or equal to @code{k} are ignored, following the prefix matching principle. Likewise, terms where dimension @code{k} has undefined length (such as @code{iota()} or view axes created with @code{insert}) are ignored. All the other terms must match.

Then we select an order of traversal. @code{ra::} supports ‘array’ orders, meaning that the dimensions are sorted in a certain way from outermost to innermost and a full dimension is traversed before one advances on the dimension outside. However, currently (v@value{VERSION}) there is no heuristic to choose a dimension order, so traversal always happens in row-major order (which shouldn't be relied upon). @code{ply_ravel} will unroll as many innermost dimensions as it can, and in some cases traversal will be executed as a flat loop.

Finally we select a traversal method. @code{ra::} has two traversal methods: @code{ply_fixed} can be used when the rank and the traversal order are known at compile time, and @code{ply_ravel} can be used in the general case.
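
The shape matching step can be sketched with plain vectors of lengths. This is only an illustration of the rule described above and not the code used by @code{ra::}; @code{Shape}, @code{undefined_len} and @code{match_shapes} are hypothetical names.

@example
@verbatim
```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr long undefined_len = -1; // made-up marker for an axis of undefined length

using Shape = std::vector<long>;

// Prefix matching. Returns true and stores the common shape in out if the terms agree.
// An axis where no term has a defined length is left as undefined_len.
bool match_shapes(std::vector<Shape> const & terms, Shape & out)
{
    std::size_t rank = 0;
    for (Shape const & s: terms) { rank = std::max(rank, s.size()); }
    out.assign(rank, undefined_len);
    for (std::size_t k=0; k<rank; ++k) {
        for (Shape const & s: terms) {
            if (s.size()<=k || s[k]==undefined_len) { continue; } // term is ignored on axis k
            assert(s[k]>=0);
            if (out[k]==undefined_len) {
                out[k] = s[k]; // first term with a defined length on axis k
            } else if (out[k]!=s[k]) {
                return false; // mismatch on axis k
            }
        }
    }
    return true;
}
```
@end verbatim
@end example

For instance, shapes (3, 4), (3) and () agree and give (3, 4), while (3, 4) and (4) don't agree, because they differ on axis 0.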

@c ------------------------------------------------
@node Loop types
@section Loop types
@c ------------------------------------------------

TODO

@c ------------------------------------------------
@node Introspection
@section Introspection
@c ------------------------------------------------

The following functions are available to query the properties of @code{ra::} objects.

@cindex @code{rank}
@anchor{x-rank}
@deftypefn @w{Function} rank_t rank e

Return the rank of expression @var{e}.

@end deftypefn

@cindex @code{shape}
@anchor{x-shape}
@deftypefn @w{Function} array shape e
@deftypefnx @w{Function} dim_t shape e k

The first form returns the shape of expression @var{e} as an array. The second form returns the length of axis @var{k}, i.e. @code{shape(e)[k]} ≡ @code{shape(e, k)}.

@var{array} might be a @ref{x-foreign-type,@code{foreign type}} (such as @code{std::array} or @code{std::vector}) instead of a @code{ra-ra} type.

@end deftypefn


@c ------------------------------------------------
@node Compiling and running
@section Compiling and running
@c ------------------------------------------------

The following @code{#define}s affect the behavior of @code{ra::}.

@itemize
@c FIXME The flag should only apply to dynamic checks.
@item @code{RA_DO_CHECK} (default 1):  Check shape agreement (e.g. @code{Big<int, 1> @{2, 3@} + Big<int, 1> @{1, 2, 3@}}) and random array accesses (e.g. @code{Small<int, 2> a = 0; int i = 10; a[i] = 0;}). See @ref{Error handling}.
@item @code{RA_DO_OPT} (default 1): Sets the default for all @code{RA_DO_OPT_XXX} flags.
@item @code{RA_DO_OPT_IOTA} (default 1): Perform immediately (beat) certain operations on @code{ra::Iota} objects. For example, @code{ra::iota(3, 0) + 1} becomes @code{ra::iota(3, 1)} instead of a two-operand expression template.
@item @code{RA_DO_OPT_SMALLVECTOR} (default 0): Perform immediately certain operations on @code{ra::Small} objects, using small vector intrinsics. Currently this only works on @b{gcc} and doesn't necessarily result in improved performance.
@end itemize

@code{ra::} comes with three kinds of tests: examples, proper tests, and benchmarks. @code{ra::} uses its own test and benchmark suites. Run @code{CXXFLAGS=-O3 scons} from the top directory of the distribution to build and run them all.  Alternatively, you can use @code{CXXFLAGS=-O3 cmake . && make && make test}. Like most expression template libraries, @code{ra::} is highly dependent on optimization by the compiler and will be orders of magnitude slower with @option{-O0} compared to @option{-O2}.

The following environment variables affect the test suite under SCons:

@itemize
@item @code{RA_USE_BLAS} (default 0): Use BLAS for @code{gemm} and @code{gemv} benchmarks.
@end itemize

@c TODO Flags and notes about different compilers

@c ------------------------------------------------
@node The future
@chapter The future
@c ------------------------------------------------

@section Error messages

FIXME

@section Reductions

FIXME

@section Etc

FIXME

@c ------------------------------------------------
@node Reference
@chapter Reference
@c ------------------------------------------------

@cindex @code{agree}
@anchor{x-agree} @defun agree arg ...
Return true if the shapes of arguments @var{arg...} match (see @ref{Rank extension}), else return false.

This is useful when @ref{Error handling,error checking} is enabled and one wants to avoid the failure response.
@example
@verbatim
    ra::Small<int, 2, 3> A;
    ra::Small<int, 2> B;
    ra::Small<int, 3> C;

    agree(A, B); // -> true
    static_assert(agree(A, B)); // ok for ct shapes
    cout << (A+B) << endl; // ok

    agree(A, C); // -> false
    cout << (A+C) << endl; // error. Maybe abort, maybe throw - cf Error Handling
@end verbatim
@end example
@end defun

@cindex @code{agree_op}
@anchor{x-agree_op} @defun agree_op op arg ...
Return true if the shapes of arguments @var{arg...} match (see @ref{Rank extension}) relative to operator @var{op}, else return false.

This differs from @ref{x-agree,@code{agree}} when @var{op} has non-zero argument ranks. For example:
@example
@verbatim
    ra::Big<real, 1> a({3}, 0.);
    ra::Big<real, 2> b({2, 3}, 0.);

    agree(a, b); // -> false
    cout << (a+b) << endl; // error

    agree_op(ra::wrank<1, 1>(std::plus()), a, b); // -> true
    cout << map(ra::wrank<1, 1>(std::plus()), a, b) << endl; // ok
@end verbatim
@end example
@end defun

@cindex @code{at}
@anchor{x-at} @defun at expr indices
Look up @var{expr} at each element of @var{indices}, which shall be a multi-index into @var{expr}.

This can be used for sparse subscripting. For example:
@example @c [ra30]
@verbatim
    ra::Big<int, 2> A = {{100, 101}, {110, 111}, {120, 121}};
    ra::Big<ra::Small<int, 2>, 2> i = {{{0, 1}, {2, 0}}, {{1, 0}, {2, 1}}};
    ra::Big<int, 2> B = at(A, i);
@end verbatim
  @result{} B = @{@{101, 120@}, @{110, 121@}@}
@end example
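
For rank-2 arguments, this lookup amounts to the following loop over plain nested vectors. @code{Index2} and @code{at2} are names made up for this sketch; the real @code{at} takes @code{ra::} expressions instead.

@example
@verbatim
```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Index2 = std::array<std::size_t, 2>;

// Look up a at each multi-index in idx. The result has the shape of idx.
std::vector<std::vector<int>>
at2(std::vector<std::vector<int>> const & a, std::vector<std::vector<Index2>> const & idx)
{
    std::vector<std::vector<int>> b;
    for (auto const & row: idx) {
        b.emplace_back();
        for (Index2 const & i: row) {
            assert(i[0]<a.size() && i[1]<a[i[0]].size());
            b.back().push_back(a[i[0]][i[1]]);
        }
    }
    return b;
}
```
@end verbatim
@end example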

@end defun

@cindex @code{cast}
@anchor{x-cast} @defun cast <type> expr
Create an array expression that casts @var{expr} into @var{type}.
@end defun

@cindex @code{collapse}
@anchor{x-collapse} @defun collapse
TODO
See also @ref{x-explode,@code{explode}}.
@end defun

@cindex @code{concrete}
@anchor{x-concrete} @defun concrete a
Convert the argument to a container of the same shape as @var{a}.

If the argument has rt or ct shape, it is the same for the result. The main use of this function is to obtain a modifiable copy of an array expression without having to prepare a container beforehand, or compute the appropriate type.

@c FIXME example

@end defun

@cindex @code{diag}
@anchor{x-diag} @defun diag view
Equivalent to @code{transpose<0, 0>(view)}.
@end defun

@cindex @code{explode}
@anchor{x-explode} @defun explode
TODO

See also @ref{x-collapse,@code{collapse}}.
@end defun

@cindex @code{for_each}
@anchor{x-for_each} @defun for_each op expr ...
Create an array expression that applies @var{op} to @var{expr} ..., and traverse it.

@var{op} is run for effect; whatever it returns is discarded. For example:
@example
@verbatim
double s = 0.;
for_each([&s](auto && a) { s+=a; }, ra::Small<double, 3> {1., 2., 3.});
@end verbatim
@result{} s = 6.
@end example

See also @ref{x-map,@code{map}}.

@end defun

@cindex @code{format_array}
@anchor{x-format_array} @defun format_array expr [last_axis_separator [second_last_axis_separator ...]]
Formats an array for character output.

For example:

@example
@verbatim
ra::Small<int, 2, 2> A = {{1, 2}, {3, 4}};
cout << "case a:\n" << A << endl;
cout << "case b:\n" << format_array(A) << endl;
cout << "case c:\n" <<  format_array(A, "|", "-") << endl;
@end verbatim
@print{} case a:
1 2
3 4
case b:
1 2
3 4
case c:
1|2-3|4
@end example

The shape that might be printed before the expression itself (depending on its type) is not subject to these separators.

See also @ref{x-noshape,@code{noshape}}, @ref{x-withshape,@code{withshape}}.

@end defun

@cindex @code{from}
@anchor{x-from} @defun from op expr ...
Create outer product expression. This is defined as

@display
E = from(op, e₀, e₁ ...) ⇒ E(i₀₀, i₀₁ ..., i₁₀, i₁₁, ..., ...) = op[e₀(i₀₀, i₀₁, ...), e₁(i₁₀, i₁₁, ...), ...].
@end display

For example:
@example
@verbatim
    ra::Big<double, 1> a {1, 2, 3};
    ra::Big<double, 1> b {10, 20, 30};
    ra::Big<double, 2> axb = from([](auto && a, auto && b) { return a*b; }, a, b);
@end verbatim
  @result{} axb = @{@{10, 20, 30@}, @{20, 40, 60@}, @{30, 60, 90@}@}
@end example

@example
@verbatim
    ra::Big<int, 1> i {2, 1};
    ra::Big<int, 1> j {0, 1};
    ra::Big<double, 2> A = {{1, 2}, {3, 4}, {5, 6}};
    ra::Big<double, 2> Aij = from(A, i, j);
@end verbatim
  @result{} Aij = @{@{5, 6@}, @{3, 4@}@}
@end example

The last example is more or less how @code{A(i, j)} is implemented for arbitrary subscripts (@pxref{The rank conjunction}).
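
For two rank-1 arguments, the definition above reduces to a double loop. The sketch below uses plain @code{std::vector}s; @code{outer} is a made-up name and the code is not how @code{from} is implemented.

@example
@verbatim
```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Outer product of two rank-1 arguments: e[i][j] = op(a[i], b[j]).
template <class Op>
std::vector<std::vector<double>>
outer(Op && op, std::vector<double> const & a, std::vector<double> const & b)
{
    std::vector<std::vector<double>> e(a.size(), std::vector<double>(b.size()));
    for (std::size_t i=0; i<a.size(); ++i) {
        for (std::size_t j=0; j<b.size(); ++j) {
            e[i][j] = op(a[i], b[j]);
        }
    }
    return e;
}
```
@end verbatim
@end example

Each argument contributes its own set of indices to the result, unlike in @code{map}, where the arguments share their indices.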

@end defun

@cindex @code{imag_part}
@anchor{x-imag_part} @defun imag_part
Take the imaginary part of a complex number. The result can be used as a reference, so it can be assigned to.

For example: @c [ma115]

@example
@verbatim
ra::Small<std::complex<double>, 2, 2> A = {{1., 2.}, {3., 4.}};
imag_part(A) = -2*real_part(A);
cout << A << endl;
@end verbatim
@print{}
(1, -2) (2, -4)
(3, -6) (4, -8)
@end example

See also @ref{x-real_part,@code{real_part}}.
@end defun

@cindex @code{map}
@anchor{x-map} @defun map op expr ...
Create an array expression that applies callable @var{op} to @var{expr} ...

For example:
@example
@verbatim
ra::Big<double, 1> x = map(cos, std::array {0.});
@end verbatim
@result{} x = @{ 1. @}
@end example

@var{op} can return a reference. For example:

@example
@verbatim
ra::Big<int, 2> x({3, 3}, 0);
ra::Big<int, 1> i = {0, 1, 1, 2};
ra::Big<int, 1> j = {1, 0, 2, 1};
map(x, i, j) = 1;
@end verbatim
@result{} x = @{@{0, 1, 0@}, @{1, 0, 1@}, @{0, 1, 0@}@}
@end example

Note that @code{map(x, i, j)} is @ref{x-subscript-outer-product,not the same} as @code{x(i, j)}.

@var{op} can be any callable. For example:
@example
@verbatim
struct A { int a, b; };
std::vector<A> v = {{1, 2}, {3, 4}};
ra::map(&A::a, v) = -ra::map(&A::b, v); // pointer to member
@end verbatim
@result{} v = @{@{-2, 2@}, @{-4, 4@}@}
@end example

See also @ref{x-for_each,@code{for_each}}.

@end defun

@cindex @code{pack}
@anchor{x-pack} @defun pack <type> expr ...
Create an array expression that brace-constructs @var{type} from @var{expr} ...
@end defun

@cindex @code{pick}
@anchor{x-pick} @defun pick select_expr expr ...
Create an array expression that selects the first of @var{expr} ... if @var{select_expr} is 0, the second if @var{select_expr} is 1, and so on. The expressions that are not selected are not looked up.

This function cannot be defined using @ref{x-map,@code{map}}, because @code{map} looks up each one of its argument expressions before calling @var{op}.

For example:
@example @c cf examples/readme.cc [ma100].
@verbatim
    ra::Small<int, 3> s {2, 1, 0};
    ra::Small<double, 3> z = pick(s, s*s, s+s, sqrt(s));
@end verbatim
  @result{} z = @{1.41421, 2, 0@}
@end example

@end defun

@cindex @code{ply}
@anchor{x-ply} @defun ply expr
Traverse @var{expr}. @code{ply} returns @code{void} so @var{expr} should be run for effect.

It is rarely necessary to use @code{ply}. Expressions are traversed automatically when they are assigned to views, for example, or printed out. @ref{x-for_each,@code{for_each}}@code{(...)}, which is equivalent to @code{ply(map(...))}, should cover most other uses.

@example
@verbatim
double s = 0.;
ply(map([&s](auto && a) { s+=a; }, ra::Small<double, 3> {1., 2., 3.})); // same as for_each
@end verbatim
@result{} s = 6.
@end example

@end defun

@cindex @code{real_part}
@anchor{x-real_part} @defun real_part
Take the real part of a complex number. The result can be used as a reference.

See also @ref{x-imag_part,@code{imag_part}}.
@end defun

@cindex @code{reverse}
@anchor{x-reverse} @defun reverse view k
Create a new view by reversing axis @var{k} of @var{view}.

This is equivalent to @code{view(ra::dots<k>, ra::iota(ra::len, ra::len-1, -1))}.
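
In terms of lengths and steps, an axis can be reversed without copying any elements: point at the last element along that axis and negate the step. The following rank-1 sketch illustrates the idea with a made-up @code{View1} type; it isn't necessarily how @code{ra::} does it.

@example
@verbatim
```cpp
#include <cassert>
#include <cstddef>

// Hypothetical descriptor of a rank-1 strided view; not an ra:: type.
struct View1
{
    int * data;
    std::ptrdiff_t len;
    std::ptrdiff_t step;
    int & operator()(std::ptrdiff_t i) const { return data[i*step]; }
};

// Reverse by starting at the last element and negating the step.
View1 reversed(View1 const & v)
{
    assert(v.len>=0);
    if (v.len==0) { return v; } // nothing to reverse, and no last element to point at
    return View1 { v.data + (v.len-1)*v.step, v.len, -v.step };
}
```
@end verbatim
@end example

The reversed view aliases the original data, so writing through it modifies the original array.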

This operation does not work on arbitrary array expressions yet. TODO FILL

@end defun

@cindex @code{size}
@anchor{x-size} @defun size a
Get the total size of an @code{ra::} object: the product of all its lengths.
@end defun

@c FIXME example

@cindex @code{stencil}
@anchor{x-stencil} @defun stencil view lo hi
Create a stencil on @var{view} with lower bounds @var{lo} and higher bounds @var{hi}.

@var{lo} and @var{hi} are expressions of rank 1 indicating the extent of the stencil on each dimension. Scalars are rank extended, that is, @var{lo}=0 is equivalent to @var{lo}=(0, 0, ..., 0) with length equal to the rank @code{r} of @var{view}. The stencil view has twice as many axes as @var{view}. The first @code{r} dimensions are the same as those of @var{view} except that they have their lengths reduced by @var{lo}+@var{hi}. The last @code{r} dimensions correspond to the stencil around each element of @var{view}; the center element is at @code{s(i0, i1, ..., lo(0), lo(1), ...)}.
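
As an illustration for rank 1, the sketch below builds a stencil over a @code{std::vector} and uses it to compute second differences. @code{Stencil1} and @code{second_difference} are made-up names, and the length @var{lo}+@var{hi}+1 of the stencil axis is an assumption of this sketch.

@example
@verbatim
```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical rank-1 stencil over a std::vector: s(i, j) = v[i+j].
// Its shape is (size-lo-hi, lo+hi+1) and s(i, lo) is the center element.
struct Stencil1
{
    std::vector<double> const & v;
    std::ptrdiff_t lo, hi;

    std::ptrdiff_t len(int k) const
    {
        return k==0 ? std::ptrdiff_t(v.size())-lo-hi : lo+hi+1;
    }
    double operator()(std::ptrdiff_t i, std::ptrdiff_t j) const
    {
        assert(i>=0 && i<len(0) && j>=0 && j<len(1));
        return v[std::size_t(i+j)];
    }
};

// Second differences, using a stencil with lo = hi = 1.
std::vector<double> second_difference(std::vector<double> const & v)
{
    Stencil1 s { v, 1, 1 };
    std::vector<double> out;
    for (std::ptrdiff_t i=0; i<s.len(0); ++i) {
        out.push_back(s(i, 0) - 2*s(i, 1) + s(i, 2));
    }
    return out;
}
```
@end verbatim
@end example

The result has two elements fewer than the input, which matches the reduction of the first axis by @var{lo}+@var{hi}.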

This operation does not work on arbitrary array expressions yet. TODO FILL

@end defun

@cindex @code{swap}
@anchor{x-swap} @defun swap a b
Swap the contents of containers @var{a} and @var{b}.

Both containers must be of the same storage type. The containers may have different shapes, but if at least one of them is of ct rank, then both of them must have the same rank.

This function reuses @code{std::swap} for same-rank overloads, so it must not be qualified (i.e. use @code{swap(a, b)}, not @code{ra::swap(a, b)}).
@end defun

@example @c [ra30]
@verbatim
    ra::Big<int> a ({2, 3}, 1 + ra::_0 - ra::_1); // (1)
    ra::Big<int> b ({2, 3, 4}, 1 - ra::_0 + ra::_1 + ra::_2); // (2)
    swap(a, b);
    // as if (1) had b and (2) had a
@end verbatim
@end example

@cindex @code{transpose}
@anchor{x-transpose}
@defun transpose <axes ...> (view) | (axes, view)
Create a new view by transposing the axes of @var{view}. The number of @var{axes} must match the rank of @var{view}.

For example:

@example
@verbatim
    ra::Unique<double, 2> a = {{1, 2, 3}, {4, 5, 6}};
    cout << transpose<1, 0>(a) << endl;
@end verbatim
  @print{}
  3 2
  1 4
  2 5
  3 6
@end example

The rank of the result is @code{maxᵢ(axesᵢ)+1} and it may be smaller or larger than that of @var{view}. If an axis is repeated, the smallest of the dimensions of @var{view} is used. For example:

@example
@verbatim
    ra::Unique<double, 2> a = {{1, 2, 3}, {4, 5, 6}};
    cout << transpose<0, 0>(a) << endl; // { a(0, 0), a(1, 1) }
@end verbatim
  @print{}
  2
  1 5
@end example

If one of the destination axes isn't mentioned in @var{axes}, then it becomes a ‘dead’ axis similar to those produced by @ref{x-insert,@code{insert}}. For example:

@example
@verbatim
    ra::Unique<double, 1> a = {1, 2, 3};
    cout << ((a*10) + transpose<1>(a)) << endl;
@end verbatim
  @print{}
  3 3
  11 12 13
  21 22 23
  31 32 33
@end example
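
The effect of the axis list on lengths and steps can be sketched as follows. Source axis @code{k} goes to destination axis @code{axes[k]}; this sketch assumes that a repeated destination adds up the steps of its source axes, as one needs in order to walk along a diagonal. @code{Axis} and @code{transpose_dims} are made-up names, and -1 is used here as a marker for the undefined length of a dead axis.

@example
@verbatim
```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Axis { std::ptrdiff_t len, step; };

// Compute the axes of the transposed view from those of the source view.
std::vector<Axis> transpose_dims(std::vector<int> const & axes, std::vector<Axis> const & src)
{
    assert(axes.size()==src.size());
    int rank = 0;
    for (int a: axes) { assert(a>=0); rank = std::max(rank, a+1); }
    std::vector<Axis> dst(rank, Axis { -1, 0 }); // unmentioned axes stay dead
    for (std::size_t k=0; k<axes.size(); ++k) {
        assert(src[k].len>=0);
        Axis & d = dst[axes[k]];
        d.len = d.len<0 ? src[k].len : std::min(d.len, src[k].len); // smallest length wins
        d.step += src[k].step; // repeated destinations add up the steps
    }
    return dst;
}
```
@end verbatim
@end example

For a packed row-major array of shape (2, 3), that is with steps (3, 1), the axis list (0, 0) gives a single axis of length 2 and step 4, which visits elements (0, 0) and (1, 1).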

The two argument form lets you specify the axis list at runtime. In that case the result will have rt rank as well. For example: @c [ma117]

@example
@verbatim
    ra::Small<int, 2> axes = {0, 1};
    ra::Unique<double, 2> a = {{1, 2, 3}, {4, 5, 6}};
    cout << "A: " << transpose(axes, a) << endl;
    axes = {1, 0};
    cout << "B: " << transpose(axes, a) << endl;
@end verbatim
  @print{}
  A: 2
  2 3
  1 2 3
  4 5 6
  B: 2
  3 2
  1 4
  2 5
  3 6
@end example

This operation does not work on arbitrary array expressions yet. TODO FILL

@end defun

@cindex @code{where}
@anchor{x-where} @defun where pred_expr true_expr false_expr
Create an array expression that selects @var{true_expr} if @var{pred_expr} is @code{true}, and @var{false_expr} if @var{pred_expr} is @code{false}. The expression that is not selected is not looked up.

For example:
@example
@verbatim
    ra::Big<double, 1> s {1, -1, 3, 2};
    s = where(s>=2, 2, s); // saturate s
@end verbatim
  @result{} s = @{1, -1, 2, 2@}
@end example

@end defun

@cindex @code{wrank}
@anchor{x-wrank} @defun wrank <input_rank ...> op
Wrap @var{op} using a rank conjunction (@pxref{The rank conjunction}).

For example: TODO
@example
@verbatim
@end verbatim
  @result{} x = 0
@end example

@end defun

@c @anchor{x-reshape}
@c @defun reshape view shape
@c Create a new view with shape @var{shape} from the row-major ravel of @var{view}.
@c FIXME fill when the implementation is more mature...
@c @end defun


@c @anchor{x-ravel}
@c @defun ravel view
@c Return the ravel of @var{view} as a view on @var{view}.
@c FIXME fill when the implementation is more mature...
@c @end defun

@cindex @code{noshape}
@cindex @code{withshape}
@anchor{x-noshape}
@anchor{x-withshape}
@deffn @w{Special object} {withshape noshape}
If either of these objects is sent to @code{std::ostream} before an expression object, the shape of that object will or will not be printed.

If the object has ct shape, the default is not to print the shape, so @code{noshape} isn't necessary, and conversely for @code{withshape} if the object has rt shape. Note that the array readers [@code{operator>>(std::istream &, ...)}] expect the shape to be present or not according to the default.

For example:

@example
@verbatim
ra::Small<int, 2> A = {77, 99};
cout << "case a:\n" << A << endl;
cout << "case b:\n" << ra::noshape << A << endl;
cout << "case c:\n" << ra::withshape << A << endl;
@end verbatim
@print{} case a:
77 99
case b:
77 99
case c:
2
77 99
@end example

but:

@example
@verbatim
ra::Big<int> A = {77, 99};
cout << "case a:\n" << A << endl;
cout << "case b:\n" << ra::noshape << A << endl;
cout << "case c:\n" << ra::withshape << A << endl;
@end verbatim
@print{} case a:
1
2
77 99
case b:
77 99
case c:
1
2
77 99
@end example

Note that in the last example the very shape of @code{ra::Big<int>} has rt length, so that length (the rank of @code{A}, that is 1) is printed as part of the shape of @code{A}.

See also @ref{x-format_array,@code{format_array}}.

@end deffn

@cindex @code{ptr}
@anchor{x-ptr}
@deffn @w{Function} ptr bidirectional_iterator [len]
@deffnx @w{Function} ptr bidirectional_range
Create rank-1 expression from foreign object.

If @code{len} is not given for @var{bidirectional_iterator}, the expression has undefined length, and it will need to be matched with other expressions whose length is defined. @code{ra::} doesn't know what is actually accessible through the iterator, so be careful. For instance:

@example
@verbatim
int pp[] = {1, 2, 3};
int * p = pp; // erase length
ra::Big<int, 1> v3 {1, 2, 3};
ra::Big<int, 1> v4 {1, 2, 3, 4};
v3 += ra::ptr(p); // ok, shape (3): v3 = {2, 4, 6}
v4 += ra::ptr(p); // undefined, shape (4): bad access to p[3]
// cout << (ra::ptr(p)+ra::iota()) << endl; // ct error, expression has undefined shape
cout << (ra::ptr(p, 3)+ra::iota()) << endl; // ok, prints { 1, 3, 5 }
cout << (ra::ptr(p, 4)+ra::iota()) << endl; // undefined, bad access at p[3]
@end verbatim
@end example

Of course in this example one could simply have used @code{pp} instead of @code{ra::ptr(p)}, since the array type retains shape information.

@example
@verbatim
v3 += pp; // ok
v4 += pp; // error checked by ra::, shape clash (4) += (3)
cout << (pp + ra::iota()) << endl; // ok
@end verbatim
@end example

You don't need to use @code{ra::ptr} on STL containers and built-in arrays, which are converted to rank-1 expressions of the right size automatically. See also @ref{x-start,@code{start}}.

@end deffn

@cindex @code{start}
@anchor{x-start} @defun start foreign_object
Create array expression from @var{foreign_object}.

@var{foreign_object} can be a built-in array (e.g. @code{int[3][2]}), a @code{std::random_access_range} type (including @code{std::vector} or @code{std::array}, @pxref{Compatibility}), an initializer list, or any object that @code{ra::} accepts as scalar (see @ref{x-is-scalar,@code{here}}). The resulting expression has shape according to the original object. Compare this with @ref{x-scalar,@code{scalar}}, which will always produce an expression of rank 0.

Generally one can mix these types with @code{ra::} expressions without needing @code{ra::start}, but sometimes this isn't possible, for example for operators that must be class members.

@example
@verbatim
std::vector<int> x = {1, 2, 3};
ra::Big<int, 1> y = {10, 20, 30};
cout << (x+y) << endl; // same as ra::start(x)+y
// x += y; // error: no match for operator+=
ra::start(x) += y; // ok
@end verbatim
@print{} 3
  11 22 33
  @result{} x = @{ 11, 22, 33 @}
@end example

@end defun

@cindex @code{scalar}
@anchor{x-scalar}
@defun scalar expr
Create scalar expression from @var{expr}.

The primary use of this function is to bring a scalar object into the @code{ra::} namespace. A somewhat artificial example:

@example
@verbatim
struct W { int x; };
ra::Big<W, 1> w { {1}, {2}, {3} };

// error: no matching function for call to start(W)
// for_each([](auto && a, auto && b) { cout << (a.x + b.x) << endl; }, w, W {7});

// bring W into ra:: with ra::scalar
for_each([](auto && a, auto && b) { cout << (a.x + b.x) << endl; }, w, ra::scalar(W {7}));
@end verbatim
@print{} 8
   9
   10
@end example

See also @ref{x-scalar-char-star,@code{this example}}.

Since @code{scalar} produces an object with rank 0, it's also useful when dealing with nested arrays, even for objects that are already in @code{ra::}. Consider:
@example
@verbatim
using Vec2 = ra::Small<double, 2>;
Vec2 x {-1, 1};
ra::Big<Vec2, 1> c { {1, 2}, {2, 3}, {3, 4} };
// c += x // error: x has shape (2) and c has shape (3)
c += ra::scalar(x); // ok: scalar(x) has shape () and matches c.
@end verbatim
  @result{} c = @{ @{0, 3@}, @{1, 4@}, @{2, 5@} @}
@end example
The result is @{c(0)+x, c(1)+x, c(2)+x@}. Compare this with
@example
@verbatim
c(ra::iota(2)) += x; // c(ra::iota(2)) with shape (2) matches x with shape (2)
@end verbatim
  @result{} c = @{ @{-1, 2@}, @{2, 5@}, @{2, 5@} @}
@end example
where the result is @{c(0)+x(0), c(1)+x(1), c(2)@}.
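
To make the difference between the two matchings concrete, the following is a sketch in plain C++ that uses no @code{ra::} types at all. It only mimics what the two statements above compute. The names @code{add_to_each} and @code{add_matching} are made up for this illustration and are not part of @code{ra::}.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Vec2 = std::array<double, 2>;

// Mimics c += ra::scalar(x): x counts as a single rank-0 item,
// so the whole of x is added to every element of c.
void add_to_each(std::vector<Vec2> & c, Vec2 const & x)
{
    for (Vec2 & ci : c) {
        ci[0] += x[0];
        ci[1] += x[1];
    }
}

// Mimics c(ra::iota(2)) += x: element i of c matches x(i),
// and that single number is added to both components of c(i).
void add_matching(std::vector<Vec2> & c, Vec2 const & x)
{
    assert(c.size() >= x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        c[i][0] += x[i];
        c[i][1] += x[i];
    }
}
```

Applying @code{add_to_each} and then @code{add_matching} to the initial @code{c} and @code{x} of the examples reproduces the two results shown above.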

@end defun

@cindex @code{iter}
@anchor{x-iter}
@defun iter <k> (view)
Create iterator over the @var{k}-cells of @var{view}. If @var{k} is negative, it is interpreted as the negative of the frame rank. In the current version of @code{ra::}, @var{view} may have runtime (rt) or compile-time (ct) shape.

@example
@verbatim
ra::Big<int, 2> c {{1, 3, 2}, {7, 1, 3}};
cout << "max of each row: " << map([](auto && a) { return amax(a); }, iter<1>(c)) << endl;
ra::Big<int, 1> m({3}, 0);
scalar(m) = max(scalar(m), iter<1>(c));
cout << "max of each column: " << m << endl;
m = 0;
for_each([&m](auto && a) { m = max(m, a); }, iter<1>(c));
cout << "max of each column again: " << m << endl;
@end verbatim
@print{} max of each row: 2
   3 7
   max of each column: 3
   7 3 3
   max of each column again: 3
   7 3 3
@end example

@c [ma113]
In the following example, @code{iter} emulates @code{scalar}. Note that the shape () of @code{iter<1>(m)} matches the shape (2) of @code{iter<1>(c)}. Thus, each of the 1-cells of @code{c} matches against the single 1-cell of @code{m}.

@example
@verbatim
m = 0;
iter<1>(m) = max(iter<1>(m), iter<1>(c));
cout << "max of each column yet again: " << m << endl;
@end verbatim
@print{} max of each column yet again: 3
   7 3 3
@end example
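
The column maximum examples above all fold the rows of @code{c} into a single accumulator, while the row maximum reduces each row on its own. As a rough illustration of that difference, here is a sketch in plain C++ with no @code{ra::} types. The names @code{row_max} and @code{column_max} are made up for this illustration and are not part of @code{ra::}.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Row = std::vector<int>;

// Maximum of each row: every 1-cell (row) is reduced separately.
std::vector<int> row_max(std::vector<Row> const & c)
{
    std::vector<int> out;
    for (Row const & row : c) {
        assert(!row.empty());
        out.push_back(*std::max_element(row.begin(), row.end()));
    }
    return out;
}

// Maximum of each column: every row is folded elementwise into the same
// accumulator m, which starts at 0 as in the examples above.
std::vector<int> column_max(std::vector<Row> const & c, std::size_t ncols)
{
    std::vector<int> m(ncols, 0);
    for (Row const & row : c) {
        assert(row.size() == ncols);
        for (std::size_t j = 0; j < ncols; ++j) {
            m[j] = std::max(m[j], row[j]);
        }
    }
    return m;
}
```

With the 2 by 3 array @code{c} of the examples above, @code{row_max} gives 3 7 and @code{column_max} gives 7 3 3.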

The following example computes the trace of each of the items [(-1)-cells] of @code{c}. @c [ma104]

@example
@verbatim
ra::Small<int, 3, 2, 2> c = ra::_0 - ra::_1 - 2*ra::_2;
cout << "c: " << c << endl;
cout << "s: " << map([](auto && a) { return sum(diag(a)); }, iter<-1>(c)) << endl;
@end verbatim
@print{} c: 0 -2
   -1 -3
   1 -1
   0 -2
   2 0
   1 -1
   s: -3 -1 1
@end example

@end defun

@cindex @code{sum}
@anchor{x-sum}
@defun sum expr
Return the sum (+) of the elements of @var{expr}, or 0 if @var{expr} is empty. This sum is performed in unspecified order.
@end defun

@cindex @code{prod}
@anchor{x-prod}
@defun prod expr
Return the product (*) of the elements of @var{expr}, or 1 if @var{expr} is empty. This product is performed in unspecified order.
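
The values returned for empty arguments by @code{sum} and @code{prod} are the identity elements of + and *. A minimal sketch in standard C++ shows why this is the natural choice. It uses @code{std::accumulate}, which always folds from left to right, so it illustrates the empty case but not the unspecified order. The names @code{sum_of} and @code{prod_of} are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Fold with + starting from 0, so an empty range yields 0.
double sum_of(std::vector<double> const & v)
{
    return std::accumulate(v.begin(), v.end(), 0.);
}

// Fold with * starting from 1, so an empty range yields 1.
double prod_of(std::vector<double> const & v)
{
    return std::accumulate(v.begin(), v.end(), 1., std::multiplies<double>{});
}
```

Since floating point addition and multiplication are not associative, a reduction in unspecified order may differ in the last digits from a left to right fold like this one.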
@end defun

@cindex @code{amax}
@anchor{x-amax}
@defun amax expr
Return the maximum of the elements of @var{expr}. If @var{expr} is empty, return @code{-std::numeric_limits<T>::infinity()} if the type supports it, otherwise @code{std::numeric_limits<T>::lowest()}, where @code{T} is the value type of the elements of @var{expr}.
@end defun

@cindex @code{amin}
@anchor{x-amin}
@defun amin expr
Return the minimum of the elements of @var{expr}. If @var{expr} is empty, return @code{+std::numeric_limits<T>::infinity()} if the type supports it, otherwise @code{std::numeric_limits<T>::max()}, where @code{T} is the value type of the elements of @var{expr}.
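
The rule for empty arguments in @code{amax} and @code{amin} can be sketched in standard C++ as follows. This only illustrates the rule as stated here and is not the code used in @code{ra::}. The names @code{max_identity}, @code{min_identity}, @code{amax_of} and @code{amin_of} are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Start value for a maximum: -infinity if T has one, otherwise lowest().
template <class T>
constexpr T max_identity()
{
    if constexpr (std::numeric_limits<T>::has_infinity) {
        return -std::numeric_limits<T>::infinity();
    } else {
        return std::numeric_limits<T>::lowest();
    }
}

// Start value for a minimum: +infinity if T has one, otherwise max().
template <class T>
constexpr T min_identity()
{
    if constexpr (std::numeric_limits<T>::has_infinity) {
        return std::numeric_limits<T>::infinity();
    } else {
        return std::numeric_limits<T>::max();
    }
}

// An empty v returns the start value unchanged.
template <class T>
T amax_of(std::vector<T> const & v)
{
    T m = max_identity<T>();
    for (T const & x : v) { m = std::max(m, x); }
    return m;
}

// An empty v returns the start value unchanged.
template <class T>
T amin_of(std::vector<T> const & v)
{
    T m = min_identity<T>();
    for (T const & x : v) { m = std::min(m, x); }
    return m;
}
```

The start value is chosen so that combining it with any element returns that element, which is why the result for an empty argument depends on whether @code{T} has an infinity.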
@end defun

@cindex @code{early}
@anchor{x-early}
@defun early expr default
@var{expr} is an array expression that returns @code{std::optional<T>}. @var{expr} is traversed as by @code{for_each}. If the optional ever contains a value, traversal stops and that value is returned. If traversal ends without finding a value, @var{default} is returned instead. If @var{default} is a reference, @code{early} will return its value. @c FIXME

The following definition of elementwise @code{lexicographical_compare} relies on @code{early}.

@example @c [ma108]
@verbatim
template <class A, class B>
inline bool
lexicographical_compare(A && a, B && b)
{
    return early(map([](auto && a, auto && b) { return a==b ? std::nullopt : std::make_optional(a<b); },
                     std::forward<A>(a), std::forward<B>(b)),
                 false);
}
@end verbatim
@end example
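
For readers who want to see the control flow spelled out, here is a sketch of the same idea in standard C++ only, over @code{std::vector<int>}. It is an illustration and not how @code{ra::} implements @code{early}. The names @code{early_over} and @code{lex_less} are hypothetical, and the sketch makes two assumptions of its own: it visits elements in index order, and it compares only over the common length of the two vectors.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Call f(i) for i = 0 .. n-1. Stop at the first optional that holds a value
// and return that value; if none does, return def.
template <class F, class T>
T early_over(std::size_t n, F && f, T def)
{
    for (std::size_t i = 0; i < n; ++i) {
        if (std::optional<T> r = f(i)) { // tests whether r holds a value
            return *r;
        }
    }
    return def;
}

// Elementwise lexicographical comparison over the common length of a and b.
bool lex_less(std::vector<int> const & a, std::vector<int> const & b)
{
    auto decide = [&](std::size_t i) -> std::optional<bool> {
        if (a[i] == b[i]) { return std::nullopt; } // undecided, keep going
        return a[i] < b[i];                        // decided, stop here
    };
    return early_over(std::min(a.size(), b.size()), decide, false);
}
```

Unlike @code{std::lexicographical_compare}, this sketch does not use the lengths to break ties, so two vectors that agree on their common prefix compare as not less.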

@end defun

@cindex @code{any}
@anchor{x-any}
@defun any expr
Return @code{true} if any element of @var{expr} is true, @code{false} otherwise. The traversal of the array expression will stop as soon as possible, but the traversal order is not specified.
@end defun

@cindex @code{every}
@anchor{x-every}
@defun every expr
Return @code{true} if every element of @var{expr} is true, @code{false} otherwise. The traversal of the array expression will stop as soon as possible, but the traversal order is not specified.
@end defun

@cindex @code{sqr}
@anchor{x-sqr}
@defun sqr expr
Compute the square of @var{expr}.
@end defun

@cindex @code{sqrm}
@anchor{x-sqrm}
@defun sqrm expr
Compute the square of the norm-2 of @var{expr}, that is, @code{conj(expr)*expr}.
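
For a single complex number, the identity behind this definition can be checked with the standard library alone. The sketch below is an illustration with @code{std::complex<double>}, not the code used in @code{ra::}, and @code{sqrm_of} is a hypothetical name.

```cpp
#include <cassert>
#include <complex>

// conj(z)*z has zero imaginary part; its real part is re(z)^2 + im(z)^2.
double sqrm_of(std::complex<double> z)
{
    return (std::conj(z) * z).real();
}
```

For @code{z} equal to 3+4j this gives 25, the same value that @code{std::norm} returns.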
@end defun

@cindex @code{conj}
@anchor{x-conj}
@defun conj expr
Compute the complex conjugate of @var{expr}.
@end defun

@cindex @code{xI}
@anchor{x-xI}
@defun xI expr
Compute @code{(0+1j)} times @var{expr}.
@end defun

@cindex @code{rel_error}
@anchor{x-rel-error}
@defun rel_error a b
@var{a} and @var{b} are arbitrary array expressions. Compute the error of @var{a} relative to @var{b} as

@code{(a==0. && b==0.) ? 0. : 2.*abs(a, b)/(abs(a)+abs(b))}
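
A scalar version of this formula may help to see how it behaves. The sketch below is written for two @code{double} values in standard C++ and is not the @code{ra::} implementation. It rests on one assumption that should be checked against the source: the two-argument @code{abs(a, b)} in the formula is read here as @code{abs(a-b)}. The name @code{rel_error_of} is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Symmetric relative error of two scalars; 0 when both are exactly 0.
// ASSUMPTION: abs(a, b) in the formula above is taken to mean abs(a-b).
double rel_error_of(double a, double b)
{
    if (a == 0. && b == 0.) { return 0.; }
    return 2. * std::abs(a - b) / (std::abs(a) + std::abs(b));
}
```

Under that reading the result is 0 for equal arguments and reaches 2 when exactly one of the arguments is 0.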

@end defun

@cindex @code{none}
@anchor{x-none}
@deffn @w{Special object} {none}
Pass @code{none} to container constructors to indicate that the contents shouldn't be initialized. This is appropriate when the initialization you have in mind wouldn't fit in a constructor argument. For example:

@example
@verbatim
void old_style_initializer(int m, int n, double *);
ra::Big<double> b({2, 3}, ra::none);
old_style_initializer(2, 3, b.data());
@end verbatim
@end example

@end deffn

@c ------------------------------------------------
@node @mybibnode{}
@chapter Sources
@c ------------------------------------------------

@multitable @columnfractions .1 .9

@item @mybibitem{Abr70} @tab Philip S. Abrams. An APL machine. Technical report SLAC-114 UC-32 (MISC), Stanford Linear Accelerator Center, Stanford University, Stanford, CA, USA, February 1970.
@item @mybibitem{Ber87} @tab Robert Bernecky. An introduction to function rank. ACM SIGAPL APL Quote Quad, 18(2):39–43, December 1987.
@item @mybibitem{bli17} @tab The Blitz++ meta-template library. @url{http://blitz.sourceforge.net}, November 2017.
@item @mybibitem{Cha86} @tab Gregory J. Chaitin. Physics in APL2, June 1986.
@item @mybibitem{FI68} @tab Adin D. Falkoff and Kenneth Eugene Iverson. APL\360 User’s manual. IBM Thomas J. Watson Research Center, August 1968.
@item @mybibitem{FI73}  @tab Adin D. Falkoff and Kenneth Eugene Iverson. The design of APL. IBM Journal of Research and Development, 17(4):5–14, July 1973.
@item @mybibitem{FI78}  @tab Adin D. Falkoff and Kenneth Eugene Iverson. The evolution of APL. ACM SIGAPL APL Quote Quad, 9(1):30–44, 1978.
@item @mybibitem{J S}   @tab J Primer. J Software, @url{https://www.jsoftware.com/help/primer/contents.htm}, November 2017.
@item @mybibitem{Mat}   @tab MathWorks. MATLAB documentation, @url{https://www.mathworks.com/help/matlab/}, November 2017.
@item @mybibitem{num17} @tab NumPy. @url{http://www.numpy.org}, November 2017.
@item @mybibitem{Ric08} @tab Henry Rich. J for C programmers, February 2008.
@item @mybibitem{SSM14} @tab Justin Slepak, Olin Shivers, and Panagiotis Manolios. An array-oriented language with static rank polymorphism. In Z. Shao, editor, ESOP 2014, LNCS 8410, pages 27–46, 2014.
@item @mybibitem{Vel01} @tab Todd Veldhuizen. Blitz++ user’s guide, February 2001.
@item @mybibitem{Wad90} @tab Philip Wadler. Deforestation: transforming programs to eliminate trees. Theoretical Computer Science, 73(2):231–248, June 1990. @url{https://doi.org/10.1016/0304-3975%2890%2990147-A}

@end multitable

@c ------------------------------------------------
@node Indices
@unnumbered Indices
@c ------------------------------------------------

@c @node Concept Index
@c @unnumbered Concept Index
@printindex cp
@c @node Function Index
@c @unnumbered Function Index
@c @printindex fn

@c \nocite{JLangReference,FalkoffIverson1968,Abrams1970,FalkoffIverson1973,FalkoffIverson1978,APLexamples1,ArraysCowan,KonaTheLanguage,blitz++2001}

@c ------------------------------------------------
@node Notes
@unnumbered Notes
@c ------------------------------------------------

@enumerate
@item
@code{ra::} uses the nonstandard @code{#pragma once} (supported on all major compilers).
@end enumerate

@bye
